Article code: 4951538
Journal code: 1441477
Publication year: 2017
English article: 20 pages, PDF
Full-text version: free download
English title of the ISI article
A parallel approximate SS-ELM algorithm based on MapReduce for large-scale datasets
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract
The Extreme Learning Machine (ELM) algorithm has attracted considerable attention from researchers and has been widely applied in recent years, especially to big data, because of its good generalization performance and fast learning speed. The semi-supervised Extreme Learning Machine (SS-ELM) extends ELM to semi-supervised learning, an important problem in machine learning on big data. However, the original SS-ELM algorithm must hold the data in memory before processing it, so it cannot handle the large and web-scale data sets that are common in the era of big data. To solve this problem, this paper first proposes an efficient parallel SS-ELM (PSS-ELM) algorithm on the MapReduce model, adopting a series of optimizations to improve its performance. It then proposes a parallel approximate SS-ELM algorithm based on MapReduce (PASS-ELM). PASS-ELM builds on the approximate adjacent similarity matrix (AASM) algorithm, which uses the Locality-Sensitive Hashing (LSH) scheme to compute an approximate adjacent similarity matrix, greatly reducing computational complexity and memory usage. The proposed AASM algorithm is general, because computing the adjacent similarity matrix is a key operation in many other machine learning algorithms. Experimental results demonstrate that PASS-ELM can efficiently process very large-scale data sets with good performance, without significantly affecting the accuracy of the results.
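The idea of using LSH to approximate an adjacent similarity matrix can be illustrated with a small sketch. The abstract does not specify the hash family, the similarity kernel, or how the work is split across map and reduce tasks, so everything below is an assumption made for illustration: random-hyperplane signatures as the hash, a Gaussian kernel as the similarity, and the hypothetical names `lsh_signature` and `approximate_similarity`. It is not the paper's AASM algorithm, only a single-machine picture of the general technique: points are grouped by hash key, and similarities are computed only within each group.

```python
import math
import random
from collections import defaultdict


def lsh_signature(point, hyperplanes):
    """Sign-of-projection hash: one bit per random hyperplane (assumed scheme)."""
    return tuple(
        1 if sum(p * h for p, h in zip(point, plane)) >= 0.0 else 0
        for plane in hyperplanes
    )


def approximate_similarity(points, num_bits=4, sigma=1.0, seed=0):
    """Return a sparse {(i, j): weight} map with i < j.

    Only pairs whose signatures collide are compared, so pairs that land
    in different buckets are implicitly treated as having similarity 0.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    hyperplanes = [
        [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)
    ]

    # "Map"-like step: key each point index by its hash signature.
    buckets = defaultdict(list)
    for idx, point in enumerate(points):
        buckets[lsh_signature(point, hyperplanes)].append(idx)

    # "Reduce"-like step: Gaussian-kernel weights inside each bucket only.
    weights = {}
    for members in buckets.values():
        for a in range(len(members)):
            for b in range(a + 1, len(members)):
                i, j = members[a], members[b]
                sq_dist = sum(
                    (x - y) ** 2 for x, y in zip(points[i], points[j])
                )
                weights[(i, j)] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return weights
```

With this grouping, the number of pairwise comparisons depends on the bucket sizes rather than on the square of the data set size, which is where the savings in computation and memory would come from. The price is that similar points hashed to different buckets are missed, so in practice the number of hash bits (and possibly several hash tables) would need tuning.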
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Parallel and Distributed Computing - Volume 108, October 2017, Pages 85-94