Article code: 1149185
Journal code: 957867
Publication year: 2010
English article: 11-page PDF
Full-text version: Free download
English title of the ISI article
A comparative study of the K-means algorithm and the normal mixture model for clustering: Bivariate homoscedastic case
Related topics
Engineering and Basic Sciences, Mathematics, Applied Mathematics
English abstract

The K-means algorithm and the normal mixture model method are two common clustering methods. The K-means algorithm is a popular heuristic approach which gives reasonable clustering results if the component clusters are ball-shaped. Currently, there are no analytical results for this algorithm if the component distributions deviate from the ball-shape. This paper analytically studies how the K-means algorithm changes its classification rule as the normal component distributions become more elongated under the homoscedastic assumption and compares this rule with that of the Bayes rule from the mixture model method. We show that the classification rules of both methods are linear, but the slopes of the two classification lines change in the opposite direction as the component distributions become more elongated. The classification performance of the K-means algorithm is then compared to that of the mixture model method via simulation. The comparison, which is limited to two clusters, shows that the K-means algorithm provides poor classification performances consistently as the component distributions become more elongated while the mixture model method can potentially, but not necessarily, take advantage of this change and provide a much better classification performance.
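
The contrast described in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not the paper's own simulation design. The means (-1, 0) and (1, 0), the common covariance diag(1, s^2) with s = 5, the equal mixing weights, and the sample size are all assumptions chosen for illustration, as are the helper names `simulate`, `kmeans2`, `bayes_rule`, and `error_rate`. The Bayes rule is evaluated with the true parameters rather than with estimates from a fitted mixture model, so it shows the best case that the mixture model method could approach.

```python
import numpy as np


def simulate(n, s, rng):
    """Draw n points from an equal-weight mixture of two bivariate normals
    with means (-1, 0) and (1, 0) and common covariance diag(1, s**2)."""
    labels = rng.integers(0, 2, size=n)
    means = np.array([[-1.0, 0.0], [1.0, 0.0]])
    noise = rng.normal(size=(n, 2)) * np.array([1.0, s])
    return means[labels] + noise, labels, means, np.diag([1.0, s ** 2])


def sq_dists(x, centers):
    """Squared Euclidean distance from every point to every center."""
    return ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)


def kmeans2(x, rng, restarts=5, iters=100):
    """Plain Lloyd iterations for K = 2. Keeps the restart with the
    lowest within-cluster sum of squares."""
    best_cost, best_assign = np.inf, None
    for _ in range(restarts):
        # Start from two distinct data points.
        centers = x[rng.choice(len(x), size=2, replace=False)]
        for _ in range(iters):
            assign = sq_dists(x, centers).argmin(axis=1)
            new_centers = np.array(
                [x[assign == k].mean(axis=0) for k in range(2)]
            )
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        d = sq_dists(x, centers)
        cost = d.min(axis=1).sum()
        if cost < best_cost:
            best_cost, best_assign = cost, d.argmin(axis=1)
    return best_assign


def bayes_rule(x, means, cov):
    """Linear Bayes rule for two homoscedastic normals with equal priors:
    assign to component 1 when w'(x - midpoint) > 0,
    where w = cov^{-1} (mean1 - mean0)."""
    w = np.linalg.solve(cov, means[1] - means[0])
    mid = means.mean(axis=0)
    return ((x - mid) @ w > 0).astype(int)


def error_rate(pred, truth):
    """Misclassification rate, ignoring the arbitrary cluster numbering."""
    e = float(np.mean(pred != truth))
    return min(e, 1.0 - e)


rng = np.random.default_rng(0)
x, labels, means, cov = simulate(4000, 5.0, rng)

bayes_err = error_rate(bayes_rule(x, means, cov), labels)
kmeans_err = error_rate(kmeans2(x, rng), labels)

print(f"Bayes rule error: {bayes_err:.3f}")
print(f"K-means error:    {kmeans_err:.3f}")
```

In this assumed configuration the components are elongated perpendicular to the line joining the two means. K-means therefore tends to split the data along the long axis and misses the component structure, while the Bayes rule still separates the components along the short axis. Other orientations of the elongation relative to the mean difference can behave differently, in line with the abstract's "potentially, but not necessarily" caveat.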

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Statistical Planning and Inference - Volume 140, Issue 7, July 2010, Pages 1701–1711