Article ID: 7643617
Journal: Microchemical Journal
Published Year: 2013
Pages: 28
File Type: PDF
Abstract
The successive projections algorithm (SPA) is designed to select a subset of variables with low multicollinearity and suitable predictive power for use in Multiple Linear Regression (MLR). The resulting SPA-MLR models are simpler and easier to interpret than latent-variable models such as Partial Least Squares (PLS). However, PLS tends to be less sensitive to instrumental noise. This paper proposes an extension of SPA that combines the noise-reduction properties of PLS with SPA's ability to discard non-informative variables. To this end, SPA is modified to select intervals of variables for use in PLS. The proposed iSPA-PLS algorithm is evaluated in two case studies involving near-infrared spectrometric analysis of wheat and beer extract samples. Compared with full-spectrum PLS, the resulting iSPA-PLS models performed better in both cross-validation and external prediction. iSPA-PLS and SPA-MLR showed similar cross-validation performance, but the iSPA-PLS models clearly outperformed SPA-MLR in external prediction. These results indicate that iSPA-PLS may be more robust to differences between the external prediction set and the calibration set used in the cross-validation procedure.
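To make the variable-selection idea concrete, the following is a minimal sketch of the projection chain used in classical SPA, not the authors' iSPA-PLS implementation. Starting from one column, it repeatedly projects the remaining columns onto the subspace orthogonal to the last selected column and picks the column with the largest remaining norm, which favours variables with low collinearity. The function name `spa_chain`, its parameters, and the mean-centering step are assumptions made for illustration; the interval selection and PLS modelling described in the abstract are not shown.

```python
import numpy as np


def spa_chain(X, start, n_select):
    """Return column indices chosen by a successive-projections chain.

    Hypothetical helper for illustration only. X is (samples x variables),
    start is the index of the initial column, n_select the chain length.
    Assumes n_select does not exceed the rank of the centered matrix.
    """
    X = np.asarray(X, dtype=float)
    P = X - X.mean(axis=0)  # assumption: work on mean-centered columns
    selected = [start]
    for _ in range(n_select - 1):
        v = P[:, selected[-1]].copy()
        # Project every column onto the subspace orthogonal to v.
        P = P - np.outer(v, v @ P) / (v @ v)
        norms = np.linalg.norm(P, axis=0)
        norms[selected] = -1.0  # never re-select a chosen column
        selected.append(int(np.argmax(norms)))
    return selected


# Toy data: column 1 is almost collinear with column 0, column 2 is orthogonal.
a = np.array([1.0, -1.0, 1.0, -1.0])
b = np.array([1.0, 1.0, -1.0, -1.0])
X_demo = np.column_stack([a, 2 * a + 0.01 * b, b])
chain = spa_chain(X_demo, start=0, n_select=2)
```

In this toy case the chain skips the nearly collinear column 1 and selects column 2. In practice, a full SPA procedure would also compare chains from different starting columns and of different lengths by their prediction error, which this sketch omits.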
Related Topics
Physical Sciences and Engineering > Chemistry > Analytical Chemistry