Article ID: 5129412
Journal: Journal of Multivariate Analysis
Published Year: 2017
Pages: 16
File Type: PDF
Abstract

Multiple correspondence analysis is a dimension reduction technique that plays a large role in the analysis of tables with categorical nominal variables, such as survey data. Though it is usually motivated and derived using geometric considerations, we prove that it can in fact be seen as a single proximal Newton step of a natural bilinear exponential family model for categorical data: the multinomial logit bilinear model. We compare and contrast the behavior of multiple correspondence analysis with that of this model on simulated data, and discuss new insights into both approaches and their cognate models. Consequently, multiple correspondence analysis can be used to approximate the parameters of the multinomial logit bilinear model. Indeed, estimating the model’s parameters is non-trivial, whereas multiple correspondence analysis is easily solved by a singular value decomposition and scales to large data sets. We illustrate the methods on a survey of drinking habits in France in the context of European policies against the harmful effects of alcohol on society.
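
To make the remark about the singular value decomposition concrete, the following is a minimal NumPy sketch of multiple correspondence analysis in its standard textbook form (SVD of the standardized residuals of the indicator matrix). It is not code from the paper and does not implement the proximal Newton argument or the multinomial logit bilinear model; the function names `one_hot` and `mca` and the toy data are assumptions made for illustration.

```python
import numpy as np


def one_hot(data):
    """Indicator (dummy) matrix of an integer-coded (n, p) categorical table."""
    blocks = []
    for j in range(data.shape[1]):
        levels = np.unique(data[:, j])  # only levels that actually occur
        blocks.append((data[:, j][:, None] == levels[None, :]).astype(float))
    return np.hstack(blocks)


def mca(data, n_components=2):
    """Textbook MCA: SVD of the standardized residuals of the indicator matrix.

    Returns row principal coordinates, column principal coordinates,
    and the principal inertias (squared singular values).
    """
    n, p = data.shape
    Z = one_hot(data)
    P = Z / (n * p)                      # correspondence matrix, sums to 1
    r = P.sum(axis=1)                    # row masses (all equal to 1/n)
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    k = n_components
    rows = (U[:, :k] * s[:k]) / np.sqrt(r)[:, None]
    cols = (Vt[:k].T * s[:k]) / np.sqrt(c)[:, None]
    return rows, cols, s[:k] ** 2


# Hypothetical toy table: 6 respondents, 3 questions with 2, 3 and 2 levels.
data = np.array([
    [0, 0, 1],
    [1, 2, 0],
    [0, 1, 1],
    [1, 0, 0],
    [0, 2, 1],
    [1, 1, 1],
])
rows, cols, inertia = mca(data, n_components=2)
```

Because the whole computation is a single SVD of an n × J matrix (J being the total number of categories), the cost is fixed in advance, with no iterative likelihood optimization.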
