Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
4947121 | 1439566 | 2017 | 26-page PDF | Free download |
English title of the ISI article
Heterogeneous blocked CPU-GPU accelerate scheme for large scale extreme learning machine
Keywords
Related topics
Engineering and basic sciences
Computer engineering
Artificial intelligence
English abstract
Extreme learning machine (ELM) has been intensively studied over the last decade because of its efficiency, effectiveness, and ease of implementation. Recently, a variant named local receptive fields based ELM (ELM-LRF) was proposed, which reduces global connections by introducing local receptive fields to the input layer. However, an ELM-LRF model with a large number of hidden neurons spends much of its time solving a large-scale Moore-Penrose matrix inversion (MPMI) problem, which is computationally expensive and requires a large amount of runtime memory. Moreover, this procedure cannot be directly accelerated on GPU platforms because of the limited memory of GPU devices. In this paper, we propose three efficient approaches for running ELM-LRF on a GPU platform. First, we propose a novel blocked LU decomposition algorithm that overcomes the limit of global memory size, so that ELM-LRF models of any size can be trained. Second, we present an efficient blocked Cholesky decomposition algorithm that accelerates the blocked LU approach by exploiting the characteristics of the matrices in the ELM-LRF model. Finally, we present a heterogeneous blocked CPU-GPU parallel algorithm that fully exploits the resources of a GPU node to further accelerate the blocked Cholesky decomposition in the ELM-LRF model.
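The abstract does not spell out the blocked Cholesky algorithm, so the following is only a minimal sketch of the generic textbook (right-looking) blocked Cholesky factorization, not the authors' implementation. It assumes the input matrix is symmetric positive definite and runs on the CPU in plain Python. The function names and the `block` parameter are illustrative. In an out-of-core GPU setting, each step would presumably operate on blocks small enough to fit in device memory.

```python
import math


def cholesky_unblocked(a):
    """Return lower-triangular L with L * L^T == a for a small SPD block."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = a[j][j] - sum(low[j][k] ** 2 for k in range(j))
        low[j][j] = math.sqrt(s)  # raises ValueError if a is not SPD
        for i in range(j + 1, n):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = s / low[j][j]
    return low


def blocked_cholesky(a, block):
    """Right-looking blocked Cholesky; only the lower triangle of a is read."""
    n = len(a)
    w = [row[:] for row in a]  # working copy, overwritten with L
    for k in range(0, n, block):
        e = min(k + block, n)
        # Step 1: factor the diagonal block A11 = L11 * L11^T.
        diag = cholesky_unblocked([row[k:e] for row in w[k:e]])
        for i in range(k, e):
            for j in range(k, e):
                w[i][j] = diag[i - k][j - k]
        # Step 2: panel solve L21 = A21 * L11^{-T}, one row at a time.
        for i in range(e, n):
            for j in range(k, e):
                s = w[i][j] - sum(w[i][p] * w[j][p] for p in range(k, j))
                w[i][j] = s / w[j][j]
        # Step 3: trailing update A22 -= L21 * L21^T (lower triangle only).
        for i in range(e, n):
            for j in range(e, i + 1):
                w[i][j] -= sum(w[i][p] * w[j][p] for p in range(k, e))
    # Clear the untouched upper triangle so the result is exactly L.
    for i in range(n):
        for j in range(i + 1, n):
            w[i][j] = 0.0
    return w


if __name__ == "__main__":
    a = [[4.0, 2.0, 2.0, 0.0],
         [2.0, 5.0, 3.0, 2.0],
         [2.0, 3.0, 6.0, 3.0],
         [0.0, 2.0, 3.0, 7.0]]
    for row in blocked_cholesky(a, block=2):
        print(row)
```

Only one block column and the trailing submatrix it updates are touched per iteration, which is the property that lets a blocked scheme stream a matrix larger than device memory through a fixed-size working set.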
Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 261, 25 October 2017, Pages 153-163
Authors
Shijie Li, Xin Niu, Yong Dou, Qi Lv, Yueqing Wang