A comparative study of the K-means algorithm and the normal mixture model for clustering: Bivariate homoscedastic case

Article code	Journal code	Publication year	English article
1149185	957867	2010	11-page PDF

Keywords

Mixing proportion; EM algorithm; K-means algorithm; Clustering; Data mining; Elongation; Mixture model; Misclassification rate

Related subjects

Engineering and Basic Sciences > Mathematics > Applied Mathematics

Abstract

The K-means algorithm and the normal mixture model method are two common clustering methods. The K-means algorithm is a popular heuristic approach which gives reasonable clustering results if the component clusters are ball-shaped. Currently, there are no analytical results for this algorithm if the component distributions deviate from the ball-shape. This paper analytically studies how the K-means algorithm changes its classification rule as the normal component distributions become more elongated under the homoscedastic assumption and compares this rule with that of the Bayes rule from the mixture model method. We show that the classification rules of both methods are linear, but the slopes of the two classification lines change in the opposite direction as the component distributions become more elongated. The classification performance of the K-means algorithm is then compared to that of the mixture model method via simulation. The comparison, which is limited to two clusters, shows that the K-means algorithm provides poor classification performances consistently as the component distributions become more elongated while the mixture model method can potentially, but not necessarily, take advantage of this change and provide a much better classification performance.
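The contrast described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's simulation design: the cluster means, the diagonal covariance, the sample size, and all function names (`simulate`, `kmeans2`, `bayes_label`, `compare`) are assumptions chosen for illustration. It also uses the Bayes rule with the true parameters as a stand-in for the mixture model method, which in practice would estimate those parameters (for example with the EM algorithm) and could therefore perform worse than shown here.

```python
import random


def simulate(n_per_cluster, sx, sy, rng):
    """Draw two equal-weight normal clusters centred at (-1, 0) and (1, 0)
    that share the covariance diag(sx^2, sy^2) (the homoscedastic case)."""
    data = []
    for label, mean_x in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_cluster):
            point = (rng.gauss(mean_x, sx), rng.gauss(0.0, sy))
            data.append((point, label))
    return data


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans2(points, rng, max_iter=200):
    """Plain Lloyd iterations for k = 2, started from two random data points."""
    centers = rng.sample(points, 2)
    assign = None
    for _ in range(max_iter):
        new_assign = [
            0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            for p in points
        ]
        if new_assign == assign:  # converged: no point changed cluster
            break
        assign = new_assign
        for k in (0, 1):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:  # keep the old centre if a cluster ever empties
                centers[k] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return assign


def bayes_label(point, sx, sy):
    """Bayes rule for equal priors and known parameters: assign to
    component 1 when w . (x - midpoint) > 0, with w = Sigma^{-1}(mu1 - mu0).
    Here mu1 - mu0 = (2, 0) and the midpoint is the origin."""
    w = (2.0 / sx ** 2, 0.0 / sy ** 2)
    return 1 if w[0] * point[0] + w[1] * point[1] > 0 else 0


def error_rate(pred, truth):
    """Misclassification rate, minimised over the two possible labelings."""
    wrong = sum(p != t for p, t in zip(pred, truth)) / len(truth)
    return min(wrong, 1.0 - wrong)


def compare(sy, sx=0.5, n_per_cluster=1000, seed=0):
    """Return (K-means error, Bayes-rule error) for one simulated data set."""
    rng = random.Random(seed)
    data = simulate(n_per_cluster, sx, sy, rng)
    points = [p for p, _ in data]
    truth = [t for _, t in data]
    km = kmeans2(points, rng)
    bayes = [bayes_label(p, sx, sy) for p in points]
    return error_rate(km, truth), error_rate(bayes, truth)


if __name__ == "__main__":
    for sy in (0.5, 5.0):  # ball-shaped vs. elongated along the y-axis
        km_err, bayes_err = compare(sy)
        print(f"sy={sy}: K-means error={km_err:.3f}, Bayes error={bayes_err:.3f}")
```

Both rules are linear in this setting: K-means assigns each point to the nearer centroid, while the Bayes rule weights the mean difference by the inverse covariance. When the shared covariance is stretched along the y-axis, the K-means partition tends to follow the direction of largest spread rather than the direction separating the means, which is why its error rate is expected to degrade while the Bayes rule's is not.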

Publisher

Database: Elsevier - ScienceDirect
Journal: Journal of Statistical Planning and Inference - Volume 140, Issue 7, July 2010, Pages 1701–1711

Authors

Dingxi Qiu,
