Article ID: 416915
Journal: Computational Statistics & Data Analysis
Published Year: 2006
Pages: 16 Pages
File Type: PDF
Abstract

We study a nonparametric Bayesian formulation based on the Dirichlet process mixing models for the frequency counts in a two-way contingency table. The formulation provides a model averaging framework for cluster analysis in the contingency table, because it specifies various partitions of the subjects according to their classification probabilities. We develop a new Gibbs sampler that improves upon the current collapsed Gibbs sampler by blocking and by using the clustering configuration to reduce the number of classification probabilities that must be updated. The performance of the new Gibbs sampler is compared with that of the existing one. We apply the new method to two data sets: one concerns testing for homogeneity in a contingency table, and the other concerns determining whether the residue frequency is independent of the position in a protein binding site, using a data set of DNA sequences.
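To make the clustering idea concrete, the following is a minimal sketch of a textbook collapsed Gibbs sweep for a Dirichlet process mixture over the rows of a two-way table. It is not the blocked sampler developed in the paper, and it may differ in detail from the "existing" sampler the abstract refers to. The assumptions are: rows play the role of subjects, each cluster shares one vector of classification probabilities, the base measure is a Dirichlet distribution with parameter `beta`, and `alpha` is the Dirichlet process concentration. All function and variable names are hypothetical.

```python
import math
import random


def log_beta_fn(a):
    """Log of the multivariate beta function B(a)."""
    return sum(math.lgamma(x) for x in a) - math.lgamma(sum(a))


def log_predictive(row, cluster_counts, beta):
    """Log marginal probability of `row` given a cluster's pooled counts.

    The cluster's probability vector is integrated out under the assumed
    Dirichlet(beta) base measure. The multinomial coefficient of `row` is
    omitted because it is the same for every candidate cluster.
    """
    prior = [b + c for b, c in zip(beta, cluster_counts)]
    post = [p + r for p, r in zip(prior, row)]
    return log_beta_fn(post) - log_beta_fn(prior)


def gibbs_sweep(table, labels, alpha, beta, rng):
    """One collapsed Gibbs sweep: reassign each row's cluster label in turn."""
    n_cols = len(table[0])
    for i, row in enumerate(table):
        labels[i] = None  # take row i out of its current cluster
        sizes, pooled = {}, {}
        for j, lab in enumerate(labels):
            if lab is None:
                continue
            sizes[lab] = sizes.get(lab, 0) + 1
            acc = pooled.setdefault(lab, [0] * n_cols)
            for k, c in enumerate(table[j]):
                acc[k] += c
        # Existing clusters are weighted by size, a new cluster by alpha.
        candidates = list(sizes)
        log_w = [math.log(sizes[lab]) + log_predictive(row, pooled[lab], beta)
                 for lab in candidates]
        candidates.append(max(candidates, default=-1) + 1)
        log_w.append(math.log(alpha) + log_predictive(row, [0] * n_cols, beta))
        top = max(log_w)  # subtract the maximum for numerical stability
        weights = [math.exp(w - top) for w in log_w]
        labels[i] = rng.choices(candidates, weights=weights)[0]
    return labels


# Illustrative (made-up) 3 x 2 table: rows 0 and 1 have similar proportions.
table = [[30, 10], [28, 12], [5, 35]]
labels = [0, 0, 0]
rng = random.Random(0)
for _ in range(50):
    gibbs_sweep(table, labels, alpha=1.0, beta=[1.0, 1.0], rng=rng)
```

Averaging over many sweeps how often two rows share a label gives a rough posterior co-clustering probability, which is one way to read the "model averaging over partitions" idea in the abstract. In a homogeneity setting, a partition that places all rows in one cluster corresponds to a homogeneous table.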

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics