Article ID: 5132259
Journal: Chemometrics and Intelligent Laboratory Systems
Published Year: 2017
Pages: 12
File Type: PDF
Abstract

Principal component analysis (PCA) is a widely accepted procedure for summarizing data through dimensionality reduction. In PCA, selecting the appropriate number of components and interpreting those components are the key challenges. Sparse principal component analysis (SPCA) is a relatively recent technique that produces principal components with sparse loadings by trading off explained variance against sparsity. Although several techniques for deriving sparse loadings have been proposed, none provides detailed guidelines for choosing the penalty parameters that yield a desired level of sparsity. In this paper, we propose using a genetic algorithm (GA) to select the number of non-zero loadings (NNZL) in each principal component when applying SPCA. The proposed approach considerably improves the interpretability of the principal components and addresses the difficulty of selecting the NNZL in SPCA. Furthermore, we compare the performance of PCA and SPCA in uncovering the underlying latent structure of the data. The key features of the methodology are assessed through a synthetic example, the pitprops data, and a comparative study on the benchmark Tennessee Eastman process.
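
The idea of searching over the NNZL with a GA can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the SPCA solver, the fitness function, or the GA operators, so everything below is an assumption made for illustration. In particular, a truncated power iteration with projection deflation stands in for SPCA, the fitness is the explained variance fraction minus a hypothetical `penalty` times the fraction of non-zero loadings, and the function names (`truncated_power_pc`, `fitness`, `ga_select_nnzl`) are invented here.

```python
import numpy as np

def truncated_power_pc(S, k, iters=100):
    """Leading loading vector of S with at most k non-zero entries.

    Truncated power iteration, used here as a stand-in for an SPCA solver.
    """
    p = S.shape[0]
    v = np.zeros(p)
    v[np.argmax(np.diag(S))] = 1.0           # deterministic starting vector
    for _ in range(iters):
        w = S @ v
        keep = np.argsort(np.abs(w))[-k:]    # indices of the k largest |entries|
        v_new = np.zeros(p)
        v_new[keep] = w[keep]
        norm = np.linalg.norm(v_new)
        if norm == 0:                        # nothing left to extract
            break
        v_new /= norm
        if np.allclose(v_new, v):
            break
        v = v_new
    return v

def fitness(nnzl, S, penalty):
    """Assumed variance-sparsity trade-off: higher is better."""
    p = S.shape[0]
    S_work = S.copy()
    captured = 0.0
    for k in nnzl:
        v = truncated_power_pc(S_work, int(k))
        captured += v @ S_work @ v
        P = np.eye(p) - np.outer(v, v)       # projection deflation
        S_work = P @ S_work @ P
    return captured / np.trace(S) - penalty * np.sum(nnzl) / (len(nnzl) * p)

def ga_select_nnzl(S, n_comp=2, pop_size=20, generations=20, penalty=0.3, seed=0):
    """Search for one NNZL value per component with a very small GA."""
    rng = np.random.default_rng(seed)
    p = S.shape[0]
    pop = rng.integers(1, p + 1, size=(pop_size, n_comp))
    for _ in range(generations):
        scores = np.array([fitness(ind, S, penalty) for ind in pop])
        # Binary tournament selection
        a = rng.integers(0, pop_size, size=pop_size)
        b = rng.integers(0, pop_size, size=pop_size)
        parents = pop[np.where(scores[a] >= scores[b], a, b)]
        # Uniform crossover with randomly permuted mates
        mates = parents[rng.permutation(pop_size)]
        children = np.where(rng.random(parents.shape) < 0.5, parents, mates)
        # Mutation: shift some genes by -1, 0, or +1, then clip to [1, p]
        mutate = rng.random(children.shape) < 0.2
        step = rng.integers(-1, 2, size=children.shape)
        children = np.clip(children + mutate * step, 1, p)
        children[0] = pop[np.argmax(scores)]  # elitism: keep the best individual
        pop = children
    scores = np.array([fitness(ind, S, penalty) for ind in pop])
    return pop[np.argmax(scores)]

# Synthetic data: two latent factors drive variables 0-3 and 4-7; 8-9 are noise.
rng = np.random.default_rng(1)
factors = rng.normal(size=(200, 2))
loadings = np.zeros((2, 10))
loadings[0, :4] = 1.0
loadings[1, 4:8] = 1.0
X = factors @ loadings + 0.3 * rng.normal(size=(200, 10))
S = np.cov(X, rowvar=False)

best_nnzl = ga_select_nnzl(S, n_comp=2)
print("Selected NNZL per component:", best_nnzl)
```

The selected values depend strongly on the assumed `penalty`: a larger value favors sparser loadings at the cost of explained variance, which is the trade-off the abstract refers to.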

Related Topics
Physical Sciences and Engineering > Chemistry > Analytical Chemistry