Article code: 4944555
Journal code: 1437999
Publication year: 2017
English article: 18-page PDF
Full-text version: free download
English title of the ISI article
Using the stability of objects to determine the number of clusters in datasets
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
We introduce a novel method for assessing the robustness of clusters found by partitioning algorithms. First, we show how the stability of individual objects can be estimated based on repeated runs of the K-means and K-medoids algorithms. The quality of the resulting clusterings, expressed by the popular Calinski-Harabasz, Silhouette, Dunn and Davies-Bouldin cluster validity indices, is taken into account when computing the stability estimates of individual objects. Second, we explain how to assess the stability of individual clusters of objects and sets of clusters that are found by partitioning algorithms. Finally, we present a new and effective stability-based algorithm that improves the ability of traditional partitioning methods to determine the number of clusters in datasets. We compare our algorithm to some well-known cluster identification techniques, including X-means, Pvclust, Adegenet, Prediction Strength and Nselectboot. Our experiments with synthetic and benchmark data demonstrate the effectiveness of the proposed algorithm in different practical situations. The R package ClusterStability has been developed to provide applied researchers with new stability estimation tools presented in this paper. It is freely distributed through the Comprehensive R Archive Network (CRAN) and available at: https://cran.r-project.org/web/packages/ClusterStability.
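The abstract describes the stability estimate only at a high level, so the following is a minimal sketch of the general idea rather than the authors' algorithm or the ClusterStability API. It assumes a simplified formula: each K-means run is weighted by its Calinski-Harabasz index, the weighted share of runs in which two objects fall in the same cluster is p, the pair's stability is max(p, 1 - p), and an object's stability is the mean over all pairs it belongs to. The function names (`kmeans`, `calinski_harabasz`, `object_stability`) and all parameters are hypothetical.

```python
import random


def sqdist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(pts) for coord in zip(*pts))


def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's K-means with random initial centers; returns one label per point."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = mean_point(members)
    return labels


def calinski_harabasz(points, labels, k):
    """Calinski-Harabasz index: between-cluster over within-cluster dispersion."""
    n = len(points)
    overall = mean_point(points)
    between = within = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue
        center = mean_point(members)
        between += len(members) * sqdist(center, overall)
        within += sum(sqdist(p, center) for p in members)
    # Guard against a zero within-cluster sum (assumption made for this sketch).
    return (between / (k - 1)) / (max(within, 1e-12) / (n - k))


def object_stability(points, k, runs=30, seed=0):
    """Quality-weighted stability score in [0.5, 1] for every object (simplified)."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0.0] * n for _ in range(n)]
    total_weight = 0.0
    for _ in range(runs):
        labels = kmeans(points, k, rng)
        weight = calinski_harabasz(points, labels, k)
        total_weight += weight
        for i in range(n):
            for j in range(i + 1, n):
                if labels[i] == labels[j]:
                    together[i][j] += weight
                    together[j][i] += weight
    scores = []
    for i in range(n):
        pair_scores = []
        for j in range(n):
            if j != i:
                p = together[i][j] / total_weight
                pair_scores.append(max(p, 1.0 - p))
        scores.append(sum(pair_scores) / len(pair_scores))
    return scores


if __name__ == "__main__":
    data_rng = random.Random(1)
    data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(20)]
    data += [(data_rng.gauss(10, 1), data_rng.gauss(10, 1)) for _ in range(20)]
    for k in (2, 3, 4):
        scores = object_stability(data, k)
        print(k, round(sum(scores) / len(scores), 3))
```

Under these assumptions, averaging the object scores for several candidate values of k and preferring the k with the highest average gives a rough stability-based way to choose the number of clusters. The paper's actual procedure also covers K-medoids and the Silhouette, Dunn and Davies-Bouldin indices, which this sketch omits.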
Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 393, July 2017, Pages 29-46