Article ID: 9653570
Journal: Neurocomputing
Published Year: 2005
Pages: 18
File Type: PDF
Abstract
Bayesian approaches to density estimation and clustering using mixture distributions allow the number of components in the mixture to be determined automatically. Previous treatments have focussed on mixtures with Gaussian components, but these are well known to be sensitive to outliers: a small number of atypical data points can exert disproportionate influence and lead to over-estimates of the number of components. In this paper we develop a Bayesian approach to mixture modelling based on Student-t distributions, which are heavier-tailed than Gaussians and hence more robust. By expressing the Student-t distribution as a marginalization over additional latent variables we are able to derive a tractable variational inference algorithm for this model, which includes Gaussian mixtures as a special case. Results on a variety of real data sets demonstrate the improved robustness of our approach.
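As a sketch of the key construction (the notation here is assumed for illustration, not quoted from the paper), the Student-t density can be written as a continuous scale mixture of Gaussians by introducing a latent precision-scaling variable u with a Gamma prior:

\[
\mathrm{St}(x \mid \mu, \Lambda, \nu) \;=\; \int_{0}^{\infty} \mathcal{N}\!\bigl(x \mid \mu, (u\Lambda)^{-1}\bigr)\, \mathrm{Gam}\!\bigl(u \mid \tfrac{\nu}{2}, \tfrac{\nu}{2}\bigr)\, du ,
\qquad
p(x) \;=\; \sum_{k=1}^{K} \pi_k\, \mathrm{St}(x \mid \mu_k, \Lambda_k, \nu_k).
\]

Treating u, alongside the usual component-indicator variables, as additional latent variables is what makes a tractable variational treatment possible, and in the limit \nu_k \to \infty each Student-t component reduces to a Gaussian, so the Gaussian mixture is recovered as a special case.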
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence
Authors