Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
536498 | 870544 | 2011 | 10-page PDF | Free download |
The classical notion of clustering is to induce an equivalence-class partition on a set of points; each class, being a homogeneous group, is called a cluster. Because the result is an equivalence-class partition, every point must belong to exactly one cluster. In many applications, however, the data are distributed such that only a subset of the points gathers into distinct clusters while the rest are scattered at random. The present paper introduces an algorithm that finds an optimal subset of points with sufficient grouping tendency, ideally filtering out the random ones. It builds the neighborhood population around every point and selects the top k dense regions, with possible reshuffling of points in post-processing. The algorithm is evaluated on real and simulated data. A comparative analysis against several state-of-the-art algorithms, using different quality indices, establishes the effectiveness of the approach.
► Classical clustering algorithms induce an equivalence-class partition on the data by assigning each point to exactly one cluster.
► However, the data distribution can be such that only a subset of the points gathers into distinct groups while the rest are scattered at random.
► This paper introduces the concept of core clustering, together with an algorithm for it.
► The algorithm identifies the top-k dense regions and filters out the sparse background distribution.
► Experiments on real and synthetic data demonstrate its effectiveness.
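The idea summarized above (count the neighborhood population around every point, keep the top k dense regions, leave the sparse background unassigned) can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `core_clusters`, the fixed-radius neighborhood, the rule that chosen centers must lie outside each other's neighborhood, and the nearest-center reassignment standing in for the "reshuffling" step are all assumptions made for illustration.

```python
import numpy as np

def core_clusters(points, k, radius):
    """Label points in (up to) k dense neighborhoods; all others get -1.

    Hypothetical sketch: `radius` and the center-separation rule are
    illustrative assumptions, not taken from the paper.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances (O(n^2) memory; fine for a small sketch).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Neighborhood population: points within `radius`, the point itself included.
    population = (dist <= radius).sum(axis=1)

    centers = []
    # Visit points from densest to sparsest; a stable sort keeps ties deterministic.
    for i in np.argsort(-population, kind="stable"):
        if len(centers) == k:
            break
        # Accept a candidate only if it lies outside every chosen neighborhood.
        if all(dist[i, c] > radius for c in centers):
            centers.append(int(i))

    labels = np.full(len(points), -1)
    if centers:
        d_to_centers = dist[:, centers]  # shape (n, number of centers)
        nearest = d_to_centers.argmin(axis=1)
        covered = d_to_centers.min(axis=1) <= radius
        # Assumed "reshuffling": a covered point joins its nearest center.
        labels[covered] = nearest[covered]
    return labels, centers
```

Points that fall inside no selected neighborhood keep the label -1, which corresponds to the random background that a classical partition would be forced to assign to some cluster.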
Journal: Pattern Recognition Letters - Volume 32, Issue 13, 1 October 2011, Pages 1554–1563