Article ID | Journal ID | Publication year | English article | Full-text version |
---|---|---|---|---|
5132366 | 1491520 | 2016 | 9-page PDF | Free download |

- A novel interval selection method named Fisher optimal subspace shrinkage (FOSS) is proposed.
- The intervals are constructed based on the regression coefficients of a PLS model.
- The proposed method has better prediction performance compared to its competitors.
Variable selection methods have been widely used for dimension reduction and improved interpretability when analyzing high-dimensional data, such as spectral data and gene expression microarray data. An interesting property of spectral data is that consecutive variables carry similar information. In other words, a spectral variable is naturally akin to another spectral variable with a close wavelength, which indicates that the regression coefficients of consecutive wavelengths have close values. Based on this fact, a new block variable selection method named Fisher optimal subspace shrinkage (FOSS) is proposed, using the Fisher optimal partitions algorithm. Unlike most existing interval selection methods, FOSS uses information from the regression coefficients of partial least squares (PLS) models to adaptively split the variables into intervals that may have unequal widths. These intervals are then selected by weighted block bootstrap sampling (WBBS). The weight of each sub-interval is determined by the mean of the absolute values of the regression coefficients in that interval. The FOSS method is particularly useful when the correlations among the variables are high. We illustrate the performance of the proposed method on three near-infrared (NIR) spectroscopy datasets. Five high-performance variable selection methods are used for comparison. Empirical studies on three real-world datasets under different performance metrics show that FOSS compares favorably to its competitors.
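The interval construction and weighting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the PLS regression coefficients are already available as a plain array, uses a standard dynamic-programming form of Fisher's optimal partition (contiguous segments minimizing the within-segment sum of squares), and assumes that WBBS draws blocks with replacement and keeps the distinct ones, a detail the abstract does not specify. The function names `fisher_optimal_partition`, `block_weights`, and `weighted_block_bootstrap` are hypothetical.

```python
import numpy as np


def fisher_optimal_partition(values, n_blocks):
    """Split an ordered sequence into n_blocks contiguous segments that
    minimize the total within-segment sum of squared deviations."""
    x = np.asarray(values, dtype=float)
    n = x.size
    if not 1 <= n_blocks <= n:
        raise ValueError("n_blocks must be between 1 and len(values)")
    # Prefix sums give the cost of any segment x[i:j] in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x * x)))

    def cost(i, j):
        # Within-segment sum of squares of x[i:j], with j > i.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # best[k, j]: minimal cost of splitting x[:j] into k segments.
    best = np.full((n_blocks + 1, n + 1), np.inf)
    cut = np.zeros((n_blocks + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_blocks + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1, i] + cost(i, j)
                if c < best[k, j]:
                    best[k, j] = c
                    cut[k, j] = i
    # Backtrack to recover the (start, stop) bounds of each segment.
    bounds, j = [], n
    for k in range(n_blocks, 0, -1):
        i = int(cut[k, j])
        bounds.append((i, j))
        j = i
    return bounds[::-1]


def block_weights(coefs, bounds):
    """Weight of each block: mean absolute coefficient, normalized to sum to 1."""
    coefs = np.asarray(coefs, dtype=float)
    w = np.array([np.mean(np.abs(coefs[a:b])) for a, b in bounds])
    return w / w.sum()


def weighted_block_bootstrap(bounds, weights, rng):
    """Assumed reading of WBBS: draw len(bounds) blocks with replacement,
    with probability proportional to weight, and keep the distinct blocks."""
    drawn = rng.choice(len(bounds), size=len(bounds), replace=True, p=weights)
    return [bounds[i] for i in np.unique(drawn)]


# Toy coefficient vector standing in for PLS regression coefficients.
coefs = np.array([0.1, 0.1, 0.1, 2.0, 2.1, 1.9, -0.5, -0.5])
bounds = fisher_optimal_partition(coefs, n_blocks=3)
weights = block_weights(coefs, bounds)
kept = weighted_block_bootstrap(bounds, weights, np.random.default_rng(0))
print(bounds, weights, kept)
```

In a full pipeline the coefficient vector would come from a fitted PLS model, and the partition, weighting, and sampling steps would presumably be repeated on the retained variables so that the subspace shrinks over iterations. The number of blocks and the stopping rule are left open here because the abstract does not state them.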
Journal: Chemometrics and Intelligent Laboratory Systems - Volume 159, 15 December 2016, Pages 196-204