Article ID: 1145253
Journal: Journal of Multivariate Analysis
Published Year: 2016
Pages: 13
File Type: PDF
Abstract

Fast-emerging high-throughput technologies have ushered scientific applications into a new era by enabling the detection of information-bearing signals at unprecedented scales. Despite this potential, the analysis of ultrahigh-dimensional data poses fundamental challenges: a deluge of irrelevant data can easily obscure the true signals. Classical statistical methods for low- to moderate-dimensional data focus on identifying strong true signals under false positive control criteria; these methods, however, have limited power for identifying weak true signals embedded in a vast amount of noise. This paper seeks to facilitate the detection of weak signals by introducing a new approach based on false negative rather than false positive control, so that a high proportion of weak signals can be retained for follow-up study. The new procedure is completely data-driven and computationally fast. We show in theory its efficiency and its adaptivity to unknown features of the data, including signal intensity and sparsity. Simulation studies further evaluate the method under various model settings, and we apply it to a real-data analysis detecting genomic variants with varying signal intensities.
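
The abstract's central contrast, screening by false positive control versus false negative control, can be illustrated with a small simulation. The sketch below is not the paper's data-driven procedure; it uses a crude stand-in threshold rule and assumed simulation parameters (10,000 tests, 200 weak signals with mean shift 2.0) only to show why a false-negative-oriented screen retains far more weak signals than an FDR-controlled selection such as Benjamini-Hochberg.

```python
import numpy as np
from scipy import stats

# --- Simulated sparse-signal data (illustrative values, not from the paper) ---
rng = np.random.default_rng(0)
p, n_signal, mu = 10_000, 200, 2.0            # many tests, few weak signals
is_signal = np.arange(p) < n_signal
z = rng.normal(size=p) + mu * is_signal       # observed z-statistics
pvals = 2 * stats.norm.sf(np.abs(z))          # two-sided p-values

def bh_select(pvals, q=0.05):
    """Benjamini-Hochberg step-up rule: false-positive (FDR) control."""
    m = len(pvals)
    order = np.argsort(pvals)
    passed = pvals[order] <= q * np.arange(1, m + 1) / m
    k = passed.nonzero()[0].max() + 1 if passed.any() else 0
    keep = np.zeros(m, dtype=bool)
    keep[order[:k]] = True
    return keep

def fn_screen(pvals, beta=0.10):
    """Illustrative false-negative-oriented screen (a stand-in, not the
    paper's procedure): choose the smallest p-value cutoff t at which the
    estimated number of retained signals, #{p_i <= t} - m*t, reaches a
    (1 - beta) fraction of a crude estimate of the total signal count."""
    m = len(pvals)
    sorted_p = np.sort(pvals)
    counts = np.arange(1, m + 1)
    excess = counts - m * sorted_p            # estimated signals among the k smallest p-values
    s_hat = max(excess.max(), 1.0)            # crude estimate of the total number of signals
    enough = excess >= (1 - beta) * s_hat
    t = sorted_p[enough.nonzero()[0].min()] if enough.any() else 1.0
    return pvals <= t

for name, keep in [("BH (FDR 5%)", bh_select(pvals)), ("FN screen", fn_screen(pvals))]:
    kept_signals = (keep & is_signal).sum()
    print(f"{name:12s} kept {keep.sum():5d} hypotheses, "
          f"{kept_signals}/{n_signal} true signals retained")
```

With weak signals (small mu), the FDR-controlled selection keeps only a handful of the 200 true signals, while the false-negative-oriented screen admits more false positives but retains most of the weak signals for follow-up study, which is the trade-off the paper's procedure is designed to manage adaptively.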

Related Topics
Physical Sciences and Engineering > Mathematics > Numerical Analysis
Authors