Article ID: 1180758 | Journal: Chemometrics and Intelligent Laboratory Systems | Published Year: 2013 | Pages: 13 | File Type: PDF
Abstract

The objective of this study was to compare two variable selection techniques, Sparse PLSR and Jack-knife PLSR, with respect to their predictive ability and their ability to identify relevant variables. Sparse PLSR is frequently used in genomics, whereas Jack-knife PLSR is often used by chemometricians. To evaluate the predictive ability of both methods, cross model validation was implemented. The performance of both methods was assessed using FTIR spectroscopic data and a set of simulated data. The stability of the variable selection procedures was examined through the frequency with which each variable was selected across the cross model validation segments. Computationally, Jack-knife PLSR was much faster than Sparse PLSR. Although both methods were found to have roughly the same predictive ability, Sparse PLSR was generally very stable in selecting the relevant variables, whereas Jack-knife PLSR was prone to also selecting uninformative variables. To remedy this drawback, a strategy that adds a perturbation parameter to the uncertainty variances obtained from Jack-knife PLSR is demonstrated.
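The selection-frequency idea described in the abstract can be illustrated with a minimal sketch. It assumes that each cross model validation segment yields a list of selected variable indices; the function name `selection_frequency` and the example selections are hypothetical and are not taken from the paper.

```python
from collections import Counter


def selection_frequency(selected_per_segment, n_variables):
    """Return, for each variable, the fraction of segments that selected it."""
    n_segments = len(selected_per_segment)
    if n_segments == 0:
        raise ValueError("need at least one validation segment")
    counts = Counter()
    for selected in selected_per_segment:
        # Count each variable at most once per segment.
        counts.update(set(selected))
    return [counts[j] / n_segments for j in range(n_variables)]


# Hypothetical selections from four segments over five variables (indices 0..4).
segments = [[0, 2], [0, 2, 4], [0, 2], [0, 3]]
freq = selection_frequency(segments, n_variables=5)
print(freq)
```

A variable with a frequency near 1 is selected in almost every segment, which is one way to read a selection as stable; variables that appear in only a few segments point to an unstable selection.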

► We compared two variable selection methods: Sparse PLSR and Jack-knife PLSR.
► Sparse PLSR outperforms Jack-knife PLSR, especially when the noise level is moderate.
► The variable selection of Sparse PLSR is stable.
► The variable selection of Jack-knife PLSR can be improved by a perturbation parameter.
► Jack-knife PLSR is less time-consuming.
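The effect of a perturbation parameter on jack-knife based selection can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact procedure: the abstract does not give the formula, so the sketch assumes a common jack-knife variance estimate (the sum of squared deviations of the sub-model coefficients from the full-model coefficient, scaled by (M - 1)/M for M sub-models), a t-like ratio of coefficient to the square root of variance plus perturbation, and a fixed cut-off of 2.0. All names and numbers are hypothetical.

```python
import math


def jackknife_t_values(b_full, b_sub, perturbation=0.0):
    """t-like ratios b / sqrt(jack-knife variance + perturbation)."""
    m = len(b_sub)
    if m < 2:
        raise ValueError("need at least two jack-knife sub-models")
    t_values = []
    for j, b in enumerate(b_full):
        # Assumed variance estimate: scaled squared deviations from b.
        ss = sum((b - sub[j]) ** 2 for sub in b_sub)
        variance = ss * (m - 1) / m
        denom = math.sqrt(variance + perturbation)
        if denom == 0.0:
            t_values.append(0.0 if b == 0 else math.copysign(math.inf, b))
        else:
            t_values.append(b / denom)
    return t_values


def select_variables(t_values, threshold=2.0):
    """Indices whose absolute t-like ratio exceeds the (assumed) cut-off."""
    return [j for j, t in enumerate(t_values) if abs(t) > threshold]


# Hypothetical coefficients: variable 0 is large, variable 1 is tiny but
# almost constant across sub-models, so its variance is very small.
b_full = [1.0, 0.02]
b_sub = [[0.9, 0.021], [1.1, 0.019], [1.0, 0.02], [1.0, 0.02]]

plain = select_variables(jackknife_t_values(b_full, b_sub))
perturbed = select_variables(jackknife_t_values(b_full, b_sub, perturbation=0.01))
print(plain, perturbed)
```

In this toy case the tiny coefficient passes the cut-off without perturbation because its estimated variance is close to zero; adding the perturbation to the variance shrinks its ratio below the cut-off while the large coefficient remains selected.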

Related Topics
Physical Sciences and Engineering > Chemistry > Analytical Chemistry