Mixed-norm partial least squares

Article ID	Journal	Published Year	Pages	File Type
1179239	Chemometrics and Intelligent Laboratory Systems	2016	12 Pages	PDF

Abstract

• We propose a new sparse PLS model, mixed-norm PLS (MNPLS), for regression modeling.
• The ℓ2,1-norm of the direction matrix is employed to select common variables.
• We develop a solution to the proposed MNPLS.
• A convergence analysis of the proposed method is conducted.

The partial least squares (PLS) method is designed for prediction problems in which the number of predictors is larger than the number of training samples. PLS is based on latent components that are linear combinations of the original predictors, so it automatically employs all predictors regardless of their relevance. This strategy can degrade its performance and make the obtained coefficients hard to interpret. Several sparse PLS (SPLS) methods have therefore been proposed to conduct prediction and variable selection simultaneously by sparsely combining the original predictors. However, if information bleeds across different components, the common variables shared by these components should be selected across the successive loadings. To address this issue, we propose a new SPLS model, mixed-norm PLS (MNPLS), which selects common variables during each deflation. More specifically, we apply the ℓ2,1 norm to the direction matrix and develop the corresponding solution to MNPLS. We also conduct a convergence analysis to support the proposed MNPLS mathematically. Experiments on four real datasets verify our theoretical analysis and demonstrate that MNPLS can generally outperform standard PLS and other existing methods in variable selection and prediction.
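To illustrate why an ℓ2,1 penalty on a direction matrix tends to select variables shared by all components, the following is a minimal NumPy sketch. It is not the MNPLS solution described in the paper. It assumes that the rows of the direction matrix index predictors and the columns index components, and it uses the standard row-wise soft-thresholding operator associated with the ℓ2,1 norm. The names l21_norm, row_soft_threshold and lam, as well as the example matrix, are hypothetical.

```python
import numpy as np

def l21_norm(W):
    """l2,1 norm: sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def row_soft_threshold(W, lam):
    """Proximal operator of lam * ||W||_{2,1}.

    Each row is shrunk toward zero as a whole; rows whose
    Euclidean norm is at most lam become exactly zero.
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return W * scale

# Hypothetical direction matrix: 4 predictors (rows), 2 components (columns).
W = np.array([[3.0, 4.0],
              [0.1, 0.0],
              [0.0, 0.0],
              [0.6, 0.8]])

W_sparse = row_soft_threshold(W, lam=0.5)

# A predictor is kept only if its whole row is nonzero, i.e. it is
# selected for every component at once (here rows 0 and 3 survive).
selected = np.flatnonzero(np.linalg.norm(W_sparse, axis=1) > 0)
```

Because the penalty acts on whole rows rather than on individual entries, a predictor is either kept for all components or dropped for all of them, which is the sense in which the selected variables are common to the components.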

Keywords

Variable selection; Regression analysis; Modeling; Prediction