Article ID: 1150285
Journal: Journal of Statistical Planning and Inference
Published Year: 2006
Pages: 16 Pages
File Type: PDF
Abstract

This paper connects consistent variable selection with multiple hypotheses testing procedures in the linear regression model $Y = X\beta + \varepsilon$, where the dimension $p$ of the parameter $\beta$ is allowed to grow with the sample size $n$. We view the variable selection problem as one of estimating the index set $I_0 \subseteq \{1,\dots,p\}$ of the non-zero components of $\beta \in \mathbb{R}^p$. Estimation of $I_0$ can be further reformulated in terms of testing the hypotheses $\beta_1 = 0, \dots, \beta_p = 0$. We study here testing via the false discovery rate (FDR) and Bonferroni methods. We show that the set $\hat{I} \subseteq \{1,\dots,p\}$ consisting of the indices of rejected hypotheses $\beta_i = 0$ is a consistent estimator of $I_0$, under appropriate conditions on the design matrix $X$ and the control values used in either procedure. This technique can handle situations where $p$ is large at a very low computational cost, as no exhaustive search over the space of the $2^p$ submodels is required.
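The selection rule described in the abstract can be illustrated with a minimal sketch. It assumes that a p-value for each hypothesis $\beta_i = 0$ has already been computed (for example from per-coefficient t-tests), and it uses the standard Bonferroni and Benjamini-Hochberg step-up rules. The function names, the p-values, and the levels `alpha` and `q` are hypothetical and chosen only for illustration; the abstract does not state the control values the paper uses, nor the conditions on $X$ under which consistency holds.

```python
def bonferroni_select(p_values, alpha):
    """Return the 1-based indices i whose p-value satisfies p_i <= alpha / p."""
    p = len(p_values)
    return {i + 1 for i, pv in enumerate(p_values) if pv <= alpha / p}


def bh_select(p_values, q):
    """Benjamini-Hochberg step-up rule at FDR level q.

    Sort the p-values, find the largest rank k with p_(k) <= k * q / p,
    and reject the k hypotheses with the smallest p-values.
    """
    p = len(p_values)
    order = sorted(range(p), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / p:
            k = rank  # keep the largest rank that passes its threshold
    return {i + 1 for i in order[:k]}


# Hypothetical p-values for H_i: beta_i = 0, i = 1..8 (illustrative only).
p_values = [0.001, 0.62, 0.012, 0.03, 0.45, 0.008, 0.91, 0.2]

# Estimated index sets: the indices of the rejected hypotheses.
I_bonf = bonferroni_select(p_values, alpha=0.05)
I_bh = bh_select(p_values, q=0.05)
print(sorted(I_bonf), sorted(I_bh))
```

Both rules need only a sort and a single pass over the $p$ p-values, which is the sense in which no search over the $2^p$ submodels is required. At equal levels, the Bonferroni set is always contained in the Benjamini-Hochberg set.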

Related Topics
Physical Sciences and Engineering > Mathematics > Applied Mathematics