Article ID: 535836
Journal: Pattern Recognition Letters
Published Year: 2012
Pages: 6 Pages
File Type: PDF
Abstract

Partial least squares (PLS) has been widely applied to scientific data sets as an effective dimension reduction technique. The number of dimensions extracted by PLS is usually determined by cross validation, but its computational load is heavy. Previous researchers proposed fixing the number at three, but intuitively this is not suitable for all data sets. Based on the intrinsic connection between PLS and the structure of data sets, two novel algorithms are proposed to determine the number of extracted principal components, keeping the valuable information while excluding the trivial. Both algorithms adapt to different data sets and are easy to implement, and both perform better than previous work on nine real-world data sets.
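
To make the cost of the cross validation baseline concrete, the following is a minimal sketch, not the algorithms proposed in the paper. It assumes a single-response regression setting (PLS1 fitted with the NIPALS iteration), NumPy, and synthetic data; the function names `pls1_fit` and `cv_select` and all parameter values are hypothetical and chosen only for illustration. The point is that a naive search refits the model once per fold and per candidate number of components.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS (PLS1) via NIPALS; return (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean          # centered, then deflated in the loop
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                        # weight: direction of max covariance
        w /= np.linalg.norm(w)
        t = Xk @ w                           # score vector (extracted component)
        tt = t @ t
        p = Xk.T @ t / tt                    # X loading
        c = yk @ t / tt                      # y loading
        Xk = Xk - np.outer(t, p)             # deflate X
        yk = yk - c * t                      # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in original X space
    return x_mean, y_mean, coef

def cv_select(X, y, max_components, n_folds=5, seed=0):
    """Pick the number of components by k-fold CV; also count model fits."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    cv_mse, n_fits = [], 0
    for k in range(1, max_components + 1):
        sq_err = 0.0
        for test_idx in folds:
            train = np.ones(len(y), dtype=bool)
            train[test_idx] = False
            x_mean, y_mean, coef = pls1_fit(X[train], y[train], k)
            n_fits += 1
            pred = (X[test_idx] - x_mean) @ coef + y_mean
            sq_err += np.sum((y[test_idx] - pred) ** 2)
        cv_mse.append(sq_err / len(y))
    best_k = int(np.argmin(cv_mse)) + 1
    return best_k, cv_mse, n_fits

# Synthetic data (assumption): 60 samples, 10 features, 3 informative ones.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10))
y = X[:, :3] @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=60)

best_k, cv_mse, n_fits = cv_select(X, y, max_components=6)
print("selected components:", best_k, "model fits:", n_fits)
```

With 5 folds and 6 candidate values, this naive search performs 30 separate fits, and the count grows linearly in both quantities. Fixing the number in advance, or deriving it from the data in a single pass, avoids that repeated fitting.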

► Model selection for partial least squares (PLS) based dimension reduction is studied.
► Cross validation is the main approach, but its computational load is heavy.
► Fixing the number of principal components at three is not appropriate for all data sets.
► Two novel algorithms are proposed based on the intrinsic structure of PLS.
► They perform better than previous work on nine scientific data sets.
