Article code, Journal code, Publication year, English article, Full-text version
6854273, 1437410, 2018, 10-page PDF, Free download
English title of the ISI article
Sparse Self-Represented Network Map: A fast representative-based clustering method for large dataset and data stream
Keywords
Fast clustering, Sparse Dynamic Instantiation, Image recognition
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
The demand for fast clustering has grown rapidly as tremendously large amounts of data have been collected over the last decade. In this paper, we propose a nonparametric, representative-based Sparse Self-Represented Network Map for fast clustering of large datasets. Each node in the network generates a heat map for the dataset by receiving stimulation from data within its Accepting Field. We develop a weight-adjusting method to learn and summarize the clustering pattern of the data. The learned map is then used to compute clustering results by breaking weak links and finding connected components. Rather than employing an iterative process to find local minima, our network passes over the dataset only once and is able to capture the global pattern of the dataset as well as detect the natural number of clusters. For this nonparametric method, we propose Sparse Dynamic Instantiation to avoid the curse of dimensionality: a node or a link is instantiated only when it is stimulated by input data. As a result, the overall complexity is linear in the data dimension. Our algorithm is tested on synthetic and real datasets and compared with popular clustering algorithms (K-means++, Expectation-Maximization, Mean-Shift and StreamKM++) as well as state-of-the-art clustering algorithms (Affinity Propagation and Density Peak). We also apply our clustering algorithm to mobile location clustering, building a Visual Dictionary for image recognition, and clustering data streams. Our experiments indicate that our algorithm can be a better alternative to all of the compared clustering algorithms, especially when efficiency is the primary consideration: we drastically improve time and space complexity while retaining an equal level of accuracy.
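The pipeline outlined in the abstract (instantiate nodes and links only when data stimulate them, pass over the data once, break weak links, then read clusters off the connected components) can be illustrated with a loose sketch. This is not the authors' algorithm: the abstract does not specify the Accepting Field or the weight-adjusting rule, so the sketch substitutes a plain grid, simple stimulation counts, and a fixed threshold. The names `sparse_grid_cluster`, `cell_size`, and `min_link_weight` are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def find_root(parent, node):
    """Union-find lookup with path halving."""
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def sparse_grid_cluster(points, cell_size=1.0, min_link_weight=2.0):
    """Grid-based stand-in for the idea in the abstract (assumed details).

    Nodes are grid cells and links join axis-adjacent cells. Both live in
    dictionaries, so an entry exists only once some point has stimulated it.
    """
    node_heat = defaultdict(float)    # cell -> number of points it received
    link_weight = defaultdict(float)  # (cell, cell) -> stimulation count
    homes = []                        # home cell of every point, cached

    # Single pass over the data: each point stimulates its home node and,
    # along every axis, the link towards the nearer neighbouring cell.
    for point in points:
        scaled = [x / cell_size for x in point]
        home = tuple(math.floor(s) for s in scaled)
        homes.append(home)
        node_heat[home] += 1.0
        for d, s in enumerate(scaled):
            step = 1 if s - home[d] >= 0.5 else -1
            neighbour = home[:d] + (home[d] + step,) + home[d + 1:]
            link_weight[tuple(sorted((home, neighbour)))] += 1.0

    # Break weak links: keep a link only if it was stimulated often enough
    # and both of its end nodes were themselves instantiated by data.
    parent = {node: node for node in node_heat}
    for (a, b), weight in link_weight.items():
        if weight >= min_link_weight and a in node_heat and b in node_heat:
            parent[find_root(parent, a)] = find_root(parent, b)

    # Connected components of the surviving links are the clusters; labels
    # are read from the cached home cells, not by re-scanning the raw data.
    cluster_ids = {}
    labels = []
    for home in homes:
        root = find_root(parent, home)
        labels.append(cluster_ids.setdefault(root, len(cluster_ids)))
    return labels, len(cluster_ids)


# Two well-separated square blobs of 81 points each.
blob_a = [(i * 0.25, j * 0.25) for i in range(9) for j in range(9)]
blob_b = [(x + 10.0, y + 10.0) for x, y in blob_a]
labels, n_clusters = sparse_grid_cluster(blob_a + blob_b)
print(n_clusters)
```

In this sketch the number of clusters is not supplied by the caller; it falls out of the connected components, which mirrors the abstract's claim of detecting the natural number of clusters. The results do depend on the assumed `cell_size` and `min_link_weight`, which stand in for details the abstract leaves unspecified.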
Publisher
Database: Elsevier - ScienceDirect
Journal: Engineering Applications of Artificial Intelligence - Volume 68, February 2018, Pages 121-130