Article ID: 1181009
Journal: Chemometrics and Intelligent Laboratory Systems
Published Year: 2013
Pages: 8 Pages
File Type: PDF
Abstract

• The MCS-RPLS method is proposed for key wavelength selection from NIRS.
• Four NIR datasets were analyzed with MCS-RPLS, and the results were compared with those of several existing methods.
• MCS-RPLS obtained a lower RMSECV in 10-fold cross-validation than MWPLS.
• A Monte Carlo strategy was adopted to ensure the robustness of the procedure.

Variable selection is a critical step in data analysis for near infrared (NIR) spectroscopy. Many recent studies have addressed variable selection, and researchers have proposed a large number of methods to identify the variables (wavelengths) that contribute useful information. In the present study, a key wavelength selection method named Monte Carlo sampling–recursive partial least squares (MCS-RPLS) is proposed. The method consists of three main steps: (1) Monte Carlo sampling; (2) feature selection for each subset; and (3) determination of the optimum feature set for the dataset. The method was applied to feature selection and multivariate calibration on four near infrared spectroscopic datasets: corn moisture, corn protein, and HSA and γ-globulin in biological samples. The 10-fold cross-validation results were compared with those obtained by full-spectrum PLS, Moving Window Partial Least Squares (MWPLS), Monte Carlo-based Uninformative Variable Elimination (MC-UVE) and CARS. The results show that the selected variables greatly reduce both the data dimensionality and the RMSECV, indicating that MCS-RPLS is suitable for feature selection from NIR data.
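The three steps listed in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe how recursive PLS ranks wavelengths or how the optimum set is determined, so the per-subset selector below is a simple absolute-correlation ranking used as a stand-in, and the final set is chosen by selection frequency across Monte Carlo runs. All names (`select_for_subset`, `mcs_select`) and parameters (`n_runs`, `sample_ratio`, `n_keep`, `min_freq`) are hypothetical, and the data are synthetic.

```python
import random
from collections import Counter
from statistics import mean


def abs_correlation(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def select_for_subset(X, y, rows, n_keep):
    """Step 2 (stand-in): rank variables on one sample subset.

    ASSUMPTION: correlation ranking replaces the recursive PLS step,
    which the abstract does not specify.
    """
    ys = [y[i] for i in rows]
    scores = []
    for j in range(len(X[0])):
        column = [X[i][j] for i in rows]
        scores.append((abs_correlation(column, ys), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:n_keep]]


def mcs_select(X, y, n_runs=50, sample_ratio=0.8, n_keep=2,
               min_freq=0.5, seed=0):
    """Steps 1 and 3: Monte Carlo sampling plus frequency-based voting."""
    rng = random.Random(seed)
    n_samples = len(X)
    subset_size = max(2, int(sample_ratio * n_samples))
    counts = Counter()
    for _ in range(n_runs):
        # Step 1: draw a random subset of samples without replacement.
        rows = rng.sample(range(n_samples), subset_size)
        # Step 2: select variables on this subset.
        counts.update(select_for_subset(X, y, rows, n_keep))
    # Step 3 (assumed rule): keep variables selected in at least
    # min_freq of the runs.
    return sorted(j for j, c in counts.items() if c / n_runs >= min_freq)


# Synthetic "spectra": 60 samples, 10 variables; only variables 2 and 7
# carry information about the response.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(10)] for _ in range(60)]
y = [2 * row[2] + 2 * row[7] + data_rng.gauss(0, 0.05) for row in X]

selected = mcs_select(X, y)
print("selected variable indices:", selected)
```

Repeating the selection over many random subsets and keeping only the variables chosen consistently is what makes the result less sensitive to any single split of the samples. In a real calibration, the stand-in selector would be replaced by a PLS-based criterion, and candidate sets would be compared by their RMSECV.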

Related Topics
Physical Sciences and Engineering > Chemistry > Analytical Chemistry