Article ID Journal Published Year Pages File Type
10151169 Computational Statistics & Data Analysis 2019 24 Pages PDF
Abstract
Robust clustering from incomplete data is an important topic because, in many practical situations, real datasets are heavy-tailed, asymmetric, and/or have arbitrary patterns of missing observations. Flexible methods and algorithms for model-based clustering are presented via mixture of the generalized hyperbolic distributions and its limiting case, the mixture of multivariate skew-t distributions. An analytically feasible EM algorithm is formulated for parameter estimation and imputation of missing values for mixture models employing missing at random mechanisms. The proposed methodologies are investigated through a simulation study with varying proportions of synthetic missing values and illustrated using a real dataset. Comparisons are made with those obtained from the traditional mixture of generalized hyperbolic distribution counterparts by filling in the missing data using the mean imputation method.
Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics
Authors
, , ,