Article ID: 6868658
Journal: Computational Statistics & Data Analysis
Published Year: 2018
Pages: 14 Pages
File Type: PDF
Abstract
Multivariate categorical data are common in many fields. An illustrative example is provided by election poll studies assessing evidence that voters' opinions change with their candidate preferences in the 2016 United States presidential primaries or caucuses. Similar goals arise in routine applications, but the current literature lacks a general methodology that combines flexibility, efficiency, and tractability in testing for group differences in multivariate categorical data at different, potentially complex, scales. This contribution addresses this goal by leveraging a Bayesian representation that factorizes the joint probability mass function of the group variable and the multivariate categorical data as the product of the marginal probabilities of the groups and the conditional probability mass function of the multivariate categorical data given group membership. To enhance flexibility, the conditional probability mass function of the multivariate categorical data is defined via a group-dependent mixture of tensor factorizations, which facilitates dimensionality reduction and borrowing of information while providing tractable computational procedures and accurate tests of global and local group differences. The proposed methods are compared with popular competitors, and their improved performance is illustrated in simulations and in American election poll studies.
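The factorization described in the abstract can be written out as a short sketch. The notation here is assumed for illustration and is not taken from the paper: y denotes the group label, x = (x_1, ..., x_p) the categorical variables, H the number of mixture components, \nu_{hy} group-specific mixture weights, and \psi_{hj} component-specific probability mass functions for the j-th variable. The second line shows one common form of a group-dependent mixture of product kernels, which may differ in detail from the authors' specification.

```latex
% Illustrative notation only (assumed, not the paper's own symbols).
\begin{align}
\Pr(Y = y,\, X = x) &= \Pr(Y = y)\,\Pr(X = x \mid Y = y), \\
\Pr(X = x \mid Y = y) &= \sum_{h=1}^{H} \nu_{hy} \prod_{j=1}^{p} \psi_{hj}(x_j),
\end{align}
% with \nu_{hy} \ge 0, \ \sum_{h=1}^{H} \nu_{hy} = 1 for every group y,
% and each \psi_{hj} a probability mass function over the categories of x_j.
```

Under this assumed form, the groups affect the data only through the weights \nu_{hy}: if the weights are the same for every y, the conditional probability mass function does not depend on the group, which corresponds to no global group difference.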
Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics