Article ID Journal Published Year Pages File Type
416722 Computational Statistics & Data Analysis 2006 15 Pages PDF
Abstract

In this paper we propose two new EM-type algorithms for model-based clustering. The first algorithm, Ascent EM, draws its ideas from the Monte Carlo EM algorithm and uses only random subsets from the entire database. Using only a subset rather than the entire database allows for significant computational improvements since many fewer data points need to be evaluated in every iteration. We also argue that one can choose the subsets intelligently by appealing to EMs highly-appreciated likelihood-ascent property. The second algorithm that we propose builds upon Ascent EM and incorporates ideas from evolutionary computation to find the global optimum. Model-based clustering can feature local, sub-optimal solutions which can make it hard to find the global optimum. Our algorithm borrows ideas from the Genetic Algorithm (GA) by incorporating the concepts of crossover, mutation and selection into EMs updating scheme. We call this new algorithm the GA Ascent EM algorithm. We investigate the performance of these two algorithms in a functional database of online auction price-curves gathered from eBay.com.

Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics
Authors
,