A scoring criterion for rejection of clustered p-values

Article ID	Journal	Published Year	Pages	File Type
6868819	Computational Statistics & Data Analysis	2018	10 Pages	PDF

Abstract

In dealing with the multiplicity problem in large datasets, clusters or families of hypotheses are often the units of interest. A scoring method is motivated for choosing a rejection space for p-values that are classified into spatial or labeled groups. A score that measures the benefits and costs of making true and false discoveries is computed, and the rejection space that maximizes the number of rejections with a positive score is adopted. Renewal and boundary-crossing theories are used to compute the exceedance probability of the score. The level of strong group type I error control is validated using Monte Carlo and importance sampling methods. It is shown that the scoring method maintains detection power and is robust against model deviation. The scoring method is applied to a copy number variation tumor dataset, and short, biologically relevant intervals of the chromosome are identified.
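The abstract does not define the score, so the following is only a generic sketch of how importance sampling can estimate a small boundary-crossing (exceedance) probability. It is not the authors' procedure. It assumes a Gaussian random walk with increments N(-mu, 1) and estimates the probability that the walk ever reaches a level b; the function name `crossing_prob_is` and the parameters `mu`, `b`, `n_sims`, and `seed` are hypothetical. Paths are simulated with the drift flipped to +mu, so every path crosses b, and each path is weighted by the likelihood ratio exp(-2 * mu * S_tau), where S_tau is the value of the walk at the crossing time.

```python
import numpy as np


def crossing_prob_is(mu, b, n_sims=2000, seed=0):
    """Estimate P(sup_n S_n >= b) for a walk with N(-mu, 1) increments.

    Illustrative assumption: the walk and its parameters stand in for a
    generic score process; they are not taken from the article.
    """
    rng = np.random.default_rng(seed)
    weights = np.empty(n_sims)
    for i in range(n_sims):
        s = 0.0
        # Simulate under the tilted measure (drift +mu), so the walk
        # reaches the boundary b with probability one.
        while s < b:
            s += rng.normal(mu, 1.0)
        # Likelihood ratio of original vs. tilted measure at the
        # crossing time: exp(-2 * mu * S_tau).
        weights[i] = np.exp(-2.0 * mu * s)
    # The average weight is an unbiased estimate of the crossing probability.
    return weights.mean()


if __name__ == "__main__":
    mu, b = 0.5, 5.0
    print("importance sampling estimate:", crossing_prob_is(mu, b))
    print("upper bound exp(-2*mu*b):   ", np.exp(-2.0 * mu * b))
```

Because S_tau is at least b at the crossing time, every weight is at most exp(-2 * mu * b), so the estimate always respects that bound. A plain Monte Carlo run under the original drift would need many more paths to observe such a rare crossing even once.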

Keywords

FDR; Sequential analysis; Multiple comparison; Importance sampling