Article ID | Journal | Published Year | Pages | File Type
397035 | International Journal of Approximate Reasoning | 2014 | 11 Pages | PDF
Abstract

• A probabilistic inference is developed for large-scale multinomial distributions.
• The method improves the Dempster–Shafer theory of belief.
• The method has a desirable long-run frequency property.
• The inference method is applied in a genome-wide association study.

Statistical analysis of multinomial counts with a large number K of categories and a small sample size n is challenging for both frequentist and Bayesian methods and requires thinking about statistical inference at a very fundamental level. Following the framework of the Dempster–Shafer theory of belief functions, a probabilistic inferential model is proposed for this “large K and small n” problem. The inferential model produces a probability triplet (p,q,r) for an assertion, conditional on the observed data. The probabilities p and q are for and against the truth of the assertion, whereas r=1−p−q is the remaining probability, called the probability of “don't know”. The new inference method is applied in a genome-wide association study with very high-dimensional count data to identify associations between genetic variants and the disease rheumatoid arthritis.
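
To make the (p,q,r) bookkeeping concrete, the following is a minimal illustrative sketch in Python. It is not the inferential model from the paper. The class name `Triplet`, its properties, and the numeric values are hypothetical and chosen only for illustration. It relies on the standard Dempster–Shafer reading in which p is the belief in the assertion and 1−q = p+r is its plausibility.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """Hypothetical container for a (p, q, r) probability triplet."""

    p: float  # probability for the truth of the assertion
    q: float  # probability against the truth of the assertion

    def __post_init__(self):
        # p and q must be non-negative and leave a non-negative remainder r.
        if self.p < 0 or self.q < 0 or self.p + self.q > 1:
            raise ValueError("need p >= 0, q >= 0 and p + q <= 1")

    @property
    def r(self) -> float:
        # Remaining probability of "don't know": r = 1 - p - q.
        return 1.0 - self.p - self.q

    @property
    def belief(self) -> float:
        # Lower probability of the assertion.
        return self.p

    @property
    def plausibility(self) -> float:
        # Upper probability of the assertion: 1 - q = p + r.
        return 1.0 - self.q


# Illustrative values only; they do not come from the paper.
example = Triplet(p=0.5, q=0.25)
print(example.r, example.belief, example.plausibility)
```

When r=0 the triplet reduces to an ordinary probability, with belief and plausibility coinciding. A large r signals that the data say little either way about the assertion, which is the situation to expect when n is small relative to K.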
