Article code: 484842 | Journal code: 703295 | Publication year: 2015 | English article: 8 pages (PDF) | Full text: free download
English title of the ISI article
Scalability of Stochastic Gradient Descent based on “Smart” Sampling Techniques
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Science (General)
English abstract

Various appealing ideas have recently been proposed in the statistical literature to scale up machine learning techniques and solve predictive/inferential problems on "Big Datasets". Beyond the massively parallelized and distributed approaches exploiting hardware architectures and programming frameworks, which have received increasing interest in recent years, several variants of the Stochastic Gradient Descent (SGD) method based on "smart" sampling procedures have been designed to accelerate the model-fitting stage. Such techniques exploit either the form of the objective functional or some supposedly available auxiliary information, and have been thoroughly investigated from a theoretical viewpoint. Though attractive, such statistical methods must also be analyzed from a computational perspective, bearing in mind the options offered by recent technological advances. It is thus of vital importance to investigate how to efficiently implement these inferential principles in order to achieve the best trade-off between computational time and accuracy. In this paper, we explore the scalability of the SGD techniques introduced in [9,11] from an experimental perspective. Issues related to their implementation on distributed computing platforms such as Apache Spark are also discussed, and experimental results based on large-scale real datasets are displayed in order to illustrate the relevance of the promoted approaches.
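To make the idea of "smart" sampling concrete, the sketch below shows SGD for least-squares regression where examples are drawn non-uniformly, with probabilities proportional to their feature norms (a common proxy for gradient magnitude), and gradients are reweighted to keep the update unbiased. This is a generic illustration under stated assumptions, not the specific schemes of [9,11]; the function name and hyperparameters are hypothetical.

```python
import numpy as np

def sgd_importance_sampling(X, y, n_iters=5000, lr=0.01, seed=0):
    """SGD for 0.5 * mean((X @ w - y)**2) with non-uniform sampling.

    Illustrative sketch only: sampling probabilities proportional to row
    norms of X, with importance weights so the stochastic gradient stays
    an unbiased estimate of the full gradient.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Sample each row i with probability p_i proportional to ||x_i||,
    # a cheap proxy for the magnitude of its gradient contribution.
    norms = np.linalg.norm(X, axis=1)
    p = norms / norms.sum()
    w = np.zeros(d)
    for _ in range(n_iters):
        i = rng.choice(n, p=p)
        # Gradient of 0.5 * (x_i . w - y_i)^2 with respect to w.
        g = (X[i] @ w - y[i]) * X[i]
        # Divide by n * p_i so that E[g / (n p_i)] equals the full
        # gradient (with uniform sampling, n * p_i = 1 and this is plain SGD).
        w -= lr * g / (n * p[i])
    return w
```

With uniform probabilities the loop reduces to standard SGD; the reweighting by `1 / (n * p[i])` is what allows an arbitrary sampling distribution without biasing the fitted model.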

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 53, 2015, Pages 308-315