Article ID: 1180228
Journal: Chemometrics and Intelligent Laboratory Systems
Published Year: 2006
Pages: 7
File Type: PDF
Abstract

Monte Carlo cross validation (MCCV) is applied to two data sets, containing 125 and 1643 near-infrared (NIR) spectra of biological samples, respectively, to determine how many samples should be left out for validation in MCCV and, consequently, the dimension of the PLS models. Once a suitable number of validation samples has been selected, the appropriate number of latent variables (LV) can be chosen correctly. The results show that the root mean squared error of calibration (RMSEC), the root mean squared error of cross validation (RMSECV), and the number of LVs all become sensitive to the size of the validation set when too many samples are left out. RMSEC and RMSECV are therefore suggested as criteria for choosing the number of samples to leave out for validation in MCCV. The method is simple and convenient to use. For a larger data set, more samples may be left out, but the suitable number decreases if the measurement error level is high.
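
The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the PLS1 fit uses a textbook NIPALS loop written in NumPy, the function names `pls1_coefs` and `mccv` are hypothetical, the data are synthetic random numbers rather than NIR spectra, and pooling the squared errors over all random splits is an assumption about how RMSEC and RMSECV are aggregated.

```python
import numpy as np

def pls1_coefs(X, y, max_lv):
    """Fit PLS1 by NIPALS; return means and one coefficient vector per LV count."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean          # centred predictors and response
    W, P, q, coefs = [], [], [], []
    for _ in range(max_lv):
        w = E.T @ f
        w /= np.linalg.norm(w)             # weight vector
        t = E @ w                          # scores
        tt = t @ t
        p = E.T @ t / tt                   # X loadings
        q_a = f @ t / tt                   # y loading
        E = E - np.outer(t, p)             # deflate X
        f = f - q_a * t                    # deflate y
        W.append(w); P.append(p); q.append(q_a)
        Wm, Pm = np.array(W).T, np.array(P).T
        # regression vector for the model with the first a latent variables
        coefs.append(Wm @ np.linalg.solve(Pm.T @ Wm, np.array(q)))
    return x_mean, y_mean, coefs

def mccv(X, y, n_left_out, max_lv, n_splits=100, seed=0):
    """Monte Carlo CV: repeated random splits; returns (RMSEC, RMSECV) per LV count."""
    rng = np.random.default_rng(seed)
    n = len(y)
    sse_cal = np.zeros(max_lv)
    sse_val = np.zeros(max_lv)
    for _ in range(n_splits):
        perm = rng.permutation(n)
        val, cal = perm[:n_left_out], perm[n_left_out:]
        x_mean, y_mean, coefs = pls1_coefs(X[cal], y[cal], max_lv)
        for a, b in enumerate(coefs):
            sse_cal[a] += np.sum((y[cal] - ((X[cal] - x_mean) @ b + y_mean)) ** 2)
            sse_val[a] += np.sum((y[val] - ((X[val] - x_mean) @ b + y_mean)) ** 2)
    rmsec = np.sqrt(sse_cal / (n_splits * (n - n_left_out)))
    rmsecv = np.sqrt(sse_val / (n_splits * n_left_out))
    return rmsec, rmsecv

# Synthetic stand-in data (assumption): 60 samples, 20 variables, 3 informative.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 20))
y = X[:, :3] @ np.array([1.0, 2.0, 3.0]) + 0.1 * rng.normal(size=60)

# Scan several validation-set sizes and report where RMSECV is smallest.
for n_left_out in (6, 15, 30, 45):
    rmsec, rmsecv = mccv(X, y, n_left_out, max_lv=8, n_splits=50)
    best_lv = int(np.argmin(rmsecv)) + 1
    print(n_left_out, best_lv, rmsec[best_lv - 1], rmsecv[best_lv - 1])
```

Scanning `n_left_out` in this way shows how RMSEC, RMSECV, and the LV count at the RMSECV minimum change as more samples are left out, which is the kind of sensitivity the abstract proposes using as a criterion.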

Related Topics
Physical Sciences and Engineering > Chemistry > Analytical Chemistry