Article ID: 5132309
Journal: Chemometrics and Intelligent Laboratory Systems
Published Year: 2017
Pages: 5 Pages
File Type: PDF
Abstract

• The choice of data window is not unique for periodic variables, e.g. angular ones.
• The choice is usually determined by convention, which often makes results unreliable and data evaluation unsatisfactory.
• We propose replacing the conventional selection of the periodic section with a data-pattern-based method that uses the data window with the smallest variance.
• The method is justified by theoretical arguments and by calculations on Ramachandran plots of triglycine and on PM10 data.

For periodic variables, e.g. angular and temporal ones, the choice of the periodic section in which experimental or calculated data are recorded is usually driven by convention. Unfortunately, basic statistics (mean, variance, covariance) and many multivariate statistical data evaluation methods (principal component analysis, clustering and classification methods, regressions …) give different results for different data windows. We propose replacing the conventional selection of the periodic section with a data-pattern-based method that selects the data window giving the smallest variance of the data. The use of the smallest variance can be supported theoretically by the maximum likelihood principle. As a first example, we show the advantages of the minimum-variance data window on two Ramachandran plots of simulated triglycine dihedral angles, where the unambiguously calculated means and the clustering results agree with common-sense expectations. The second example concerns the improved effectiveness of clustering a temporal distribution of the PM10 air pollutant when the proposed data window is applied.
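The idea of a minimum-variance data window can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `min_variance_window`, the brute-force search over candidate window starts, and the use of the population variance are all assumptions made for illustration. The sketch relies on the observation that the variance of the re-windowed data can only change when the window start passes a data point, so it tries each distinct data value as the start of the window.

```python
import numpy as np

def min_variance_window(values, period=360.0):
    """Hypothetical helper: find the window [start, start + period)
    in which the periodic data have the smallest (population) variance.

    Returns (start, shifted, variance), where `shifted` holds the data
    mapped into the selected window.
    """
    # Map all values into the reference window [0, period).
    x = np.mod(np.asarray(values, dtype=float), period)
    best = None
    # Brute force (an assumption of this sketch): try every distinct
    # data value as the window start and keep the smallest variance.
    for start in np.unique(x):
        # Map the data into [start, start + period).
        shifted = start + np.mod(x - start, period)
        var = shifted.var()
        if best is None or var < best[2]:
            best = (start, shifted, var)
    return best

# Made-up angles (degrees) clustered around the +/-180 boundary.
angles = [170, 175, -179, -172, 178]
start, shifted, var = min_variance_window(angles)
conventional_mean = np.mean(angles)   # mean in the [-180, 180) window
windowed_mean = shifted.mean()        # mean in the minimum-variance window
```

With these made-up angles, the conventional [-180°, 180°) window splits the cluster across the boundary and places the mean far from every data point, whereas the mean computed in the selected window lies inside the cluster. The search costs O(n²) for n points, which is acceptable for a sketch; sorting the data and updating running sums would be one way to make it faster.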
