کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
530944 869802 2013 12 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Clustering and outlier detection using isoperimetric number of trees
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر چشم انداز کامپیوتر و تشخیص الگو
پیش نمایش صفحه اول مقاله
Clustering and outlier detection using isoperimetric number of trees
چکیده انگلیسی


• A new graph-cut based, O(nlogn) time, data clustering algorithm is introduced.
• A precise definition is proposed for the concept of an outlier-set.
• The outlier profile of a data-set is defined and hardness results are proved.
• Approximation algorithms are given to extract clusters and outliers at the same time.
• Comparative performance analysis is presented on benchmarks and handmade examples.

We propose a graph-based data clustering algorithm which is based on exact clustering of a minimum spanning tree in terms of a minimum isoperimetry criteria. We show that our basic clustering algorithm runs in O(nlogn) and with post-processing in almost O(nlogn) (average case) and O(n2)O(n2) (worst case) time where n is the size of the data-set. It is also shown that our generalized graph model, which also allows the use of potentials at vertices, can be used to extract an extra piece of information related to anomalous data patterns and outliers. In this regard, we propose an algorithm that extracts outliers in parallel to data clustering. We also provide a comparative performance analysis of our algorithms with other related ones and we show that they behave quite effectively on hard synthetic data-sets as well as real-world benchmarks.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Pattern Recognition - Volume 46, Issue 12, December 2013, Pages 3371–3382
نویسندگان
, , ,