Validity index for clusters of different sizes and densities

Article code	Journal code	Publication year	English article	Full-text version
534855	870297	2011	14-page PDF	Free download

Keywords

K-means clustering; Clustering; Validity index; Unsupervised classification

Related topics

Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition

Abstract

Cluster validity indices are used to validate clustering results and to find the set of clusters that best fits the natural partitions of a given data set. Most previous validity indices depend considerably on the number of data objects in clusters, on cluster centroids, and on average values. They tend to ignore small clusters and clusters with low density. Two cluster validity indices are proposed for efficient validation of partitions containing clusters that differ widely in size and density. The first proposed index exploits a compactness measure and a separation measure, and the second index is based on an overlap measure and a separation measure. The compactness and overlap measures are calculated from a few data objects of a cluster, while the separation measure uses all data objects. The compactness measure is calculated only from data objects of a cluster that are far enough away from the cluster centroids, while the overlap measure is calculated from data objects that are near enough to one or more other clusters. A good partition is expected to have a low degree of overlap and larger separation distance and compactness. The maximum value of the ratio of compactness to separation and the minimum value of the ratio of overlap to separation indicate the optimal partition. Tests of both proposed indices on some artificial data sets and three well-known real data sets showed their effectiveness and reliability.
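
The selection rule in the abstract (choose the partition with the minimum ratio of overlap to separation) can be illustrated with a small sketch. The abstract does not give the authors' formulas, so everything below is an assumption made for illustration only: `separation` is taken as the smallest distance between points of different clusters, `overlap` sums a distance ratio over points that are "near enough" to another cluster's centroid, and the threshold `near=0.5` and all function names are hypothetical.

```python
import math


def group_by_label(points, labels):
    """Collect points into a dict mapping cluster label -> list of points."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def centroid(points):
    """Arithmetic mean of a non-empty list of equal-length tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def separation(clusters):
    """Assumed measure (not the paper's): smallest distance between two
    points that belong to different clusters. Needs at least two clusters."""
    labels = list(clusters)
    return min(
        math.dist(p, q)
        for i, a in enumerate(labels)
        for b in labels[i + 1:]
        for p in clusters[a]
        for q in clusters[b]
    )


def overlap(clusters, near=0.5):
    """Assumed measure (not the paper's): a point contributes only when the
    ratio of its distance to its own centroid over its distance to another
    centroid is at least `near`, i.e. it lies close to that other cluster."""
    centroids = {label: centroid(pts) for label, pts in clusters.items()}
    total = 0.0
    for label, pts in clusters.items():
        for p in pts:
            d_own = math.dist(p, centroids[label])
            for other, c in centroids.items():
                if other == label:
                    continue
                d_other = math.dist(p, c)
                # Guard against a point sitting exactly on another centroid.
                ratio = d_own / d_other if d_other > 0 else math.inf
                if ratio >= near:
                    total += ratio
    return total


def overlap_index(points, labels, near=0.5):
    """Ratio of overlap to separation; lower values indicate a better partition."""
    clusters = group_by_label(points, labels)
    sep = separation(clusters)
    if sep == 0:
        return math.inf
    return overlap(clusters, near) / sep


def select_partition(points, candidates, near=0.5):
    """Return the candidate labeling with the minimum overlap/separation ratio."""
    return min(candidates, key=lambda labels: overlap_index(points, labels, near))


# Two well-separated groups of three points each (made-up illustrative data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = [0, 0, 0, 1, 1, 1]  # labeling that follows the two groups
bad = [0, 1, 0, 1, 0, 1]   # labeling that mixes the two groups
best = select_partition(points, [bad, good])
```

In this toy setting, no point of the `good` labeling lies near the other cluster, so its overlap is zero and it is preferred over `bad`. A compactness-to-separation index would be selected with `max` instead of `min`, following the same pattern.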

Research highlights
► The existence of many cluster validity indices suggests that no single validity index is best in general.
► Existing cluster validity indices are not very efficient in the estimation of clusters of different sizes and densities.
► Cluster validity indices that are based only on fuzzy membership values are not efficient in the validation of clusters of different sizes and densities.
► Validity indices that are not based on average values are efficient in the validation of clusters of different sizes and densities.

Publisher

Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 32, Issue 2, 15 January 2011, Pages 221–234

Authors

Krista Rizman Žalik, Borut Žalik
