Article code: 8941773
Journal code: 1645027
Publication year: 2018
English article: 25 pages, PDF
Full-text version: Free download
English title of the ISI article
Speeding up the large-scale consensus fuzzy clustering for handling Big Data
Persian translation of the title
سرعت بخشیدن به خوشه بندی جامع فازی در مقیاس بزرگ برای دست زدن به داده های بزرگ
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
Massive data can create a real competitive advantage for the companies; it is used to better respond to customers, to follow the behavior of consumers, to anticipate the evolutions, etc. However, it has its own deficiencies. This data volume not only requires big storage spaces but also makes analysis, processing and retrieval operations very difficult and hugely time-consuming. One way to overcome these problems is to cluster this data into a compact format that is still an informative version of the entire data. A lot of clustering algorithms have been proposed. However, their scaling is poor in terms of computation time whenever the size of the data gets larger. In this paper, we make full use of consensus clustering to handle Big Data clustering. We use sampling combined with a split-and-merge strategy to fragment data into small subsets, then basic partitions are locally generated from them using RHadoop's parallel processing MapReduce model and later a consensus tendency is followed to obtain the final result. A scalability analysis is conducted to demonstrate the performance of the proposed clustering models by increasing both the number of computing nodes used and the sample size while satisfying the volume and the velocity dimensions.
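The pipeline outlined in the abstract (fragment the data, build basic partitions locally, then combine them by consensus) can be illustrated with a small single-machine sketch. This is not the authors' implementation: the paper uses RHadoop's MapReduce model, whereas the code below runs the "map" step in a plain loop. The consensus rule used here (re-clustering the pooled local centers with fuzzy c-means) is an assumption chosen for brevity, and all function names (`fuzzy_c_means`, `split`, `consensus_centers`, `demo_data`) are hypothetical.

```python
import random


def dist(p, q):
    """Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def fuzzy_c_means(points, c, m=2.0, iters=50):
    """Plain fuzzy c-means; returns c cluster centers as tuples."""
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centers = [points[0]]
    while len(centers) < c:
        centers.append(max(points, key=lambda p: min(dist(p, ctr) for ctr in centers)))
    dims = len(points[0])
    for _ in range(iters):
        # Membership of each point in each cluster:
        # u_ij = d_ij^(-2/(m-1)) / sum_k d_ik^(-2/(m-1))
        memberships = []
        for p in points:
            inv = [max(dist(p, ctr), 1e-12) ** (-2.0 / (m - 1.0)) for ctr in centers]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Each center becomes the mean of all points weighted by u_ij^m.
        new_centers = []
        for j in range(c):
            weights = [u[j] ** m for u in memberships]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[k] for w, p in zip(weights, points)) / total
                for k in range(dims)
            ))
        centers = new_centers
    return centers


def split(points, n_chunks):
    """Fragment the data into n_chunks interleaved subsets."""
    return [points[i::n_chunks] for i in range(n_chunks)]


def consensus_centers(points, c, n_chunks):
    """Cluster each subset locally, then cluster the pooled local centers."""
    local_centers = []
    for chunk in split(points, n_chunks):  # "map": independent per chunk
        local_centers.extend(fuzzy_c_means(chunk, c))
    return fuzzy_c_means(local_centers, c)  # "reduce": simple consensus step


def demo_data(seed=0, n_per_blob=200):
    """Two synthetic 2-D Gaussian blobs around (0, 0) and (10, 10)."""
    rng = random.Random(seed)
    data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_per_blob)]
    data += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(n_per_blob)]
    rng.shuffle(data)  # so that every chunk sees both blobs
    return data
```

Calling `consensus_centers(demo_data(), c=2, n_chunks=4)` returns two centers, one near each synthetic blob. Because each chunk is processed independently, the loop in `consensus_centers` is the part that a MapReduce framework would distribute across computing nodes.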
Publisher
Database: Elsevier - ScienceDirect
Journal: Fuzzy Sets and Systems - Volume 348, 1 October 2018, Pages 50-74