Article code | Journal code | Publication year | English article | Full-text version
415835 | 681240 | 2012 | 18 pages (PDF) | Free download
English title of the ISI article
Investigations into refinements of Storey’s method of multiple hypothesis testing minimising the FDR, and its application to test binomial data
Related topics
Engineering and basic sciences > Computer engineering > Computational theory and mathematics
English abstract

Storey’s method for multiple hypothesis testing, the “Optimal Discovery Procedure” (ODP), minimises the false discovery rate (FDR) and gives p-values and q-values (estimates of the FDR) for each test. Here it was extended by iteration to enforce consistency between the p-values of the tests and the binary parameters that define which data points contribute to the fitted null hypothesis. These parameters arise when the null hypothesis has to be estimated from the data. The ODP as previously described is optimal only for fixed values of these parameters. The extension proposed here requires the introduction of a cut-off parameter for the p-values. The work was motivated by the analysis of a set of pairs of frequencies representing gene expression for a set of genes in two libraries, from which it was desired to select the genes most likely not to follow the null hypothesis that the frequency ratio is a fixed unknown number. The method was tested by analysing many similar simulated datasets. The results showed that the ODP modified by iteration could be improved, sometimes greatly, by a suitable choice of the cut-off parameter. Varying this parameter alone may not, however, lead to the globally optimal solution, because statistical testing based on the binomial distribution is more efficient than a form of the ODP when the number of non-null hypotheses in the data is small, whereas the reverse is true when it is large. This may be an effect of using discrete data. Efficiency here is defined in terms of the expected proportion of errors that occur (q-value) when a given proportion of the data is declared “significant” (i.e. the null hypothesis is believed not to hold for them).
An improved version of the ODP along these lines is likely to have numerous applications, such as the optimised search for candidate genes that show unusual expression patterns, for example when more than two experimental conditions are compared simultaneously, and cases in which additional categorical variables or a time series are present in the experimental design.
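
To make the quantities in the abstract concrete, the following Python sketch shows the plain binomial baseline only, not the ODP and not the authors' implementation. It assumes each gene contributes a pair of counts (a, b) from the two libraries, estimates the fixed ratio by pooling the counts, and converts two-sided binomial p-values into Benjamini-Hochberg style q-values. The function `iterate_null_fit` and its `cutoff` argument are hypothetical names that loosely illustrate the idea of refitting the null only from points whose p-value exceeds a cut-off until that set stops changing.

```python
from math import comb


def binom_pmf(k, n, p):
    # Exact binomial probability; adequate for the modest counts assumed here.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_two_sided_p(k, n, p):
    # Two-sided p-value: total probability of outcomes no more likely than k.
    observed = binom_pmf(k, n, p)
    tail = sum(
        pr
        for pr in (binom_pmf(i, n, p) for i in range(n + 1))
        if pr <= observed * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, tail)


def bh_q_values(p_values):
    # Benjamini-Hochberg style q-values (assumes all hypotheses may be null).
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


def iterate_null_fit(pairs, cutoff=0.05, max_iter=50):
    # Hypothetical illustration: refit the null ratio using only the points
    # whose p-value exceeds `cutoff`, until the included set is stable.
    included = [True] * len(pairs)
    p0, p_values = None, []
    for _ in range(max_iter):
        null_a = sum(a for (a, _), keep in zip(pairs, included) if keep)
        null_n = sum(a + b for (a, b), keep in zip(pairs, included) if keep)
        if null_n == 0:
            break  # nothing left to estimate the null from
        p0 = null_a / null_n
        p_values = [binom_two_sided_p(a, a + b, p0) for a, b in pairs]
        updated = [p > cutoff for p in p_values]
        if updated == included:
            break  # p-values and inclusion flags are now consistent
        included = updated
    return p0, p_values, bh_q_values(p_values)


if __name__ == "__main__":
    # Made-up counts: twenty balanced genes and one extreme gene.
    pairs = [(10, 10)] * 20 + [(40, 0)]
    p0, p_values, q_values = iterate_null_fit(pairs)
    print(p0, q_values[-1])
```

Declaring "significant" every gene whose q-value falls below a chosen threshold then gives an estimate of the proportion of errors among the selected genes, which is the sense of efficiency used in the abstract.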

Publisher
Database: Elsevier - ScienceDirect
Journal: Computational Statistics & Data Analysis - Volume 56, Issue 12, December 2012, Pages 4381–4398