کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
535786 870379 2012 7 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Iterative random projections for high-dimensional data clustering
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر چشم انداز کامپیوتر و تشخیص الگو
پیش نمایش صفحه اول مقاله
Iterative random projections for high-dimensional data clustering
چکیده انگلیسی

In this text we propose a method which efficiently performs clustering of high-dimensional data. The method builds on random projection and the K-means algorithm. The idea is to apply K-means several times, increasing the dimensionality of the data after each convergence of K-means. We compare the proposed algorithm on four high-dimensional datasets, image, text and two synthetic, with K-means clustering using a single random projection and K-means clustering of the original high-dimensional data. Regarding time we observe that the algorithm reduces drastically the time when compared to K-means on the original high-dimensional data. Regarding mean squared error the proposed method reaches a better solution than clustering using a single random projection. More notably in the experiments performed it also reaches a better solution than clustering on the original high-dimensional data.


► We propose a novel iterative method for the hard-clustering problem suited for high-dimensional data.
► The method builds on K-means and random projections (RP) and relates to simulated annealing (SA) clustering.
► Experiments on an image, a text and two synthetic datasets show good performance regarding both accuracy and time.
► We achieve a lower MSE than using a single RP and than classic K-means.
► The running time can still be faster than classic K-means, unlike SA clustering which is drastically slower.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Pattern Recognition Letters - Volume 33, Issue 13, 1 October 2012, Pages 1749–1755
نویسندگان
, ,