Article ID | Journal | Published Year | Pages |
---|---|---|---|
7643617 | Microchemical Journal | 2013 | 28 Pages |
Abstract
The successive projections algorithm (SPA) is aimed at selecting a subset of variables with small multi-collinearity and suitable prediction power for use in Multiple Linear Regression (MLR). The resulting SPA-MLR models have advantages in terms of simplicity and ease of interpretation as compared to latent-variable models, such as Partial-Least-Squares (PLS). However, PLS tends to be less sensitive to instrumental noise. The present paper proposes an extension of SPA to combine the noise-reduction properties of PLS with the possibility of discarding non-informative variables in SPA. For this purpose, SPA is modified in order to select intervals of variables for use in PLS. The proposed iSPA-PLS algorithm is evaluated in two case studies involving near-infrared spectrometric analysis of wheat and beer extract samples. As compared to full-spectrum PLS, the resulting iSPA-PLS models exhibited better performance in terms of both cross-validation and external prediction. On the other hand, iSPA-PLS and SPA-MLR presented similar cross-validation performance, but the iSPA-PLS models clearly outperformed SPA-MLR in the external prediction. Such results indicate that iSPA-PLS may be more robust with respect to differences between the external prediction set and the calibration set used in the cross-validation procedure.
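To make the idea of selecting variables with small multi-collinearity concrete, the following is a minimal sketch of the projection step commonly associated with SPA. It is not the authors' iSPA-PLS implementation. The function name `spa_chain`, its parameters, and the toy matrix are hypothetical, and the sketch leaves out both the validation step that would choose the starting column and chain length and the interval handling that the abstract describes only at a high level.

```python
import numpy as np


def spa_chain(X, start, n_select):
    """Pick n_select column indices of X by successive orthogonal projections.

    Hypothetical helper: starting from column `start`, each step projects all
    columns onto the subspace orthogonal to the last selected column and then
    selects the column with the largest remaining norm.
    """
    Xp = np.asarray(X, dtype=float).copy()
    n_cols = Xp.shape[1]
    if not 0 <= start < n_cols:
        raise ValueError("start must be a valid column index")
    if not 1 <= n_select <= n_cols:
        raise ValueError("n_select must be between 1 and the number of columns")

    selected = [start]
    for _ in range(n_select - 1):
        ref = Xp[:, selected[-1]]
        denom = ref @ ref
        if denom == 0.0:
            # Nothing left to project on: remaining columns carry no new direction.
            break
        # Remove from every column its component along the reference column.
        Xp = Xp - np.outer(ref, ref @ Xp) / denom
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0  # never reselect a column already in the chain
        selected.append(int(np.argmax(norms)))
    return selected


# Toy data (made up for illustration): column 1 is collinear with column 0.
X_demo = np.array([
    [1.0, 2.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 3.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.5],
])
chain = spa_chain(X_demo, start=0, n_select=3)
print(chain)  # [0, 2, 4]
```

In this toy case the chain skips column 1, which is collinear with the starting column, and column 3, whose residual after projection is smaller than that of column 2. An interval-based variant would presumably apply the same projections to representatives of variable intervals and then fit PLS on the chosen intervals, but that detail is an assumption here rather than something stated in the abstract.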
Related Topics
Physical Sciences and Engineering
Chemistry
Analytical Chemistry
Authors
Adriano de Araújo Gomes, Roberto Kawakami Harrop Galvão, Mário Cesar Ugulino de Araújo, Germano Véras, Edvan Cirino da Silva