Article ID: 1147879
Journal: Journal of Statistical Planning and Inference
Published: 2013
Pages: 11
File Type: PDF
Abstract

In many scientific investigations with high- or ultra-high-dimensional feature spaces, the relevant features, though sparse, are large in number compared with classical statistical problems, and the magnitudes of their effects taper off. It is therefore reasonable to model the number of relevant features as a sequence that diverges as the sample size increases. In this paper, we investigate the properties of the extended Bayes information criterion (EBIC) (Chen and Chen, 2008) for feature selection in linear regression models with a diverging number of relevant features in high- or ultra-high-dimensional feature spaces. The selection consistency of the EBIC in this situation is established. The application of the EBIC to feature selection is considered in a SCAD cum EBIC procedure. Simulation studies are conducted to demonstrate the performance of the SCAD cum EBIC procedure in finite-sample settings.
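To make the criterion concrete, the following is a minimal sketch of how the EBIC of Chen and Chen (2008) can be computed for a candidate submodel in a linear regression. The function name `ebic` and the toy data are illustrative choices, not part of the paper; the formula used is the standard EBIC, n log(RSS/n) + k log n + 2γ log C(p, k), where k is the submodel size and p the total number of candidate features.

```python
import math
import numpy as np

def ebic(y, X_s, p, gamma=1.0):
    """Extended BIC for a candidate submodel (illustrative sketch).

    y     : (n,) response vector
    X_s   : (n, k) design matrix restricted to the k selected features
    p     : total number of candidate features
    gamma : EBIC tuning parameter, typically in [0, 1]
    """
    n, k = X_s.shape
    # Least-squares fit of the submodel and its residual sum of squares
    beta, *_ = np.linalg.lstsq(X_s, y, rcond=None)
    rss = float(np.sum((y - X_s @ beta) ** 2))
    # EBIC = n log(RSS/n) + k log n + 2*gamma*log C(p, k)
    return (n * math.log(rss / n)
            + k * math.log(n)
            + 2.0 * gamma * math.log(math.comb(p, k)))

# Toy comparison: the true feature should attain a much smaller EBIC
# than an irrelevant one when p >> n.
rng = np.random.default_rng(0)
n, p = 100, 500
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] + 0.1 * rng.standard_normal(n)
good = ebic(y, X[:, [0]], p)   # submodel containing the true feature
bad = ebic(y, X[:, [1]], p)    # submodel containing a noise feature
```

In a SCAD cum EBIC procedure of the kind the abstract describes, SCAD-penalized regression would generate a path of candidate submodels, and a criterion like the one above would select among them; the extra 2γ log C(p, k) term is what penalizes the size of the search space when p greatly exceeds n.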
