Article ID: 5132276
Journal: Chemometrics and Intelligent Laboratory Systems
Published Year: 2017
Pages: 12 Pages
File Type: PDF
Abstract

• The approaches for generation of multivariate data are reviewed and compared.
• The problem of multivariate data generation from random covariance matrices is analyzed.
• A new algorithm for multivariate data generation from random covariance matrices is proposed and made available.
• The algorithm is illustrated in problems of interest for the research community.

The simulation of multivariate data is often necessary for assessing the performance of multivariate analysis techniques. The random generation of multivariate data when the covariance matrix is completely or partly specified can be solved by several methods, from the Cholesky decomposition to some recent alternatives. Often, however, the covariance matrix itself must also be generated at random, so that the data simulation spans different situations, from highly correlated to uncorrelated data. This is the case when assessing a new multivariate analysis technique in Monte Carlo experiments. In this paper, we introduce a new algorithm for the generation of random data from covariance matrices of random structure, where the user only decides the data dimension and the level of correlation. We illustrate the application of this algorithm in several relevant problems in multivariate analysis, namely the selection of the number of Principal Components in Principal Component Analysis, the evaluation of the performance of sparse Partial Least Squares, and the calibration of Multivariate Statistical Process Control systems. The algorithm is available as part of the MEDA Toolbox v1.1.1.
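As a point of reference for the Cholesky-based approach mentioned in the abstract, the following is a minimal Python sketch of drawing Gaussian data from a fully specified covariance matrix. It is not the algorithm proposed in the paper and not the MEDA Toolbox interface; the function name `simulate_from_covariance`, the example matrix, and the sample size are all illustrative assumptions.

```python
import numpy as np


def simulate_from_covariance(cov, n_samples, seed=None):
    """Draw zero-mean Gaussian observations whose covariance is `cov`.

    Hypothetical helper for illustration only; `cov` is assumed to be
    symmetric and positive definite.
    """
    rng = np.random.default_rng(seed)
    cov = np.asarray(cov, dtype=float)
    # Lower-triangular Cholesky factor: chol @ chol.T equals cov.
    chol = np.linalg.cholesky(cov)
    # Independent standard normal scores, one row per observation.
    z = rng.standard_normal((n_samples, cov.shape[0]))
    # Each row x = chol @ z_row has covariance chol @ chol.T = cov.
    return z @ chol.T


# Assumed example: two strongly correlated variables and one independent one.
cov = np.array([[1.0, 0.8, 0.0],
                [0.8, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
X = simulate_from_covariance(cov, n_samples=50_000, seed=0)
sample_cov = np.cov(X, rowvar=False)
```

With a large sample, `sample_cov` should be close to `cov`. This sketch covers only the case where the covariance matrix is given; it does not address generating the covariance matrix itself at random, which is the problem the paper targets.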

Related Topics
Physical Sciences and Engineering > Chemistry > Analytical Chemistry