Article code: 382344
Journal code: 660757
Publication year: 2016
English article: 11-page PDF
Full-text version: free download
English title of the ISI article
DENDIS: A new density-based sampling for clustering algorithm
Keywords
Density; Distance; Space coverage; Clustering; Rand index
Related subjects
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
Abstract


• DENDIS is a hybrid algorithm whose goal is to perform sampling for clustering.
• It manages both density and distance concepts.
• It is driven by a single, meaningful parameter called granularity.
• It is accurate, fast and parsimonious thanks to internal optimizations.

To deal with large datasets, sampling can be used as a preprocessing step for clustering. In this paper, a hybrid sampling algorithm is proposed. It is density-based while managing distance concepts to ensure space coverage and fit cluster shapes. At each step a new item is added to the sample: it is chosen as the furthest from the representative in the most important group. A constraint on the hypervolume induced by the samples avoids oversampling in high-density areas. The inner structure allows for internal optimization: only a few distances have to be computed. The algorithm's behavior is investigated using synthetic and real-world data sets and compared to alternative approaches, at conceptual and empirical levels. The numerical experiments show that it is more parsimonious, faster and more accurate, according to the Rand Index, with both k-means and hierarchical clustering algorithms.
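
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the function name `density_distance_sample` is hypothetical, the "most important group" is taken to be the group with the most attached points, and sampling stops once no group holds more than `granularity * n` points. The hypervolume constraint and the internal optimization that limits distance computations are left out, so this version recomputes all distances to each new sample item.

```python
import numpy as np


def density_distance_sample(X, granularity=0.05, seed=0):
    """Greedy sampling sketch; returns the row indices of the sampled items."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    # Assumed stopping rule: no group may hold more than granularity * n points.
    threshold = max(1, int(np.ceil(granularity * n)))

    reps = [int(rng.integers(n))]                    # first representative, picked at random
    dist = np.linalg.norm(X - X[reps[0]], axis=1)    # distance to own representative
    owner = np.zeros(n, dtype=int)                   # position in reps of own representative

    while len(reps) < n:
        sizes = np.bincount(owner, minlength=len(reps))
        g = int(np.argmax(sizes))                    # largest group (assumed "most important")
        if sizes[g] <= threshold:
            break                                    # every group is small enough
        members = np.flatnonzero(owner == g)
        new = int(members[np.argmax(dist[members])]) # furthest item from the representative
        if dist[new] == 0.0:
            break                                    # largest group holds only duplicates
        reps.append(new)
        # Reattach every point that is strictly closer to the new representative.
        d_new = np.linalg.norm(X - X[new], axis=1)
        closer = d_new < dist
        owner[closer] = len(reps) - 1
        dist[closer] = d_new[closer]
    return np.array(reps)


if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    blobs = np.vstack([data_rng.normal(0, 1, (200, 2)),
                       data_rng.normal(10, 1, (200, 2))])
    sample_idx = density_distance_sample(blobs, granularity=0.1)
    print(len(sample_idx), "items sampled out of", len(blobs))
```

In this sketch a smaller `granularity` yields a larger sample, because groups keep being split until each one is below the size threshold. The resulting indices can then be passed to any clustering routine in place of the full data set.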

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 56, 1 September 2016, Pages 349–359
Authors
, ,