کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
416614 681388 2007 15 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Robust learning from bites for data mining
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
Robust learning from bites for data mining
چکیده انگلیسی

Some methods from statistical machine learning and from robust statistics have two drawbacks. Firstly, they are computer-intensive such that they can hardly be used for massive data sets, say with millions of data points. Secondly, robust and non-parametric confidence intervals for the predictions according to the fitted models are often unknown. A simple but general method is proposed to overcome these problems in the context of huge data sets. An implementation of the method is scalable to the memory of the computer and can be distributed on several processors to reduce the computation time. The method offers distribution-free confidence intervals for the median of the predictions. The main focus is on general support vector machines (SVM) based on minimizing regularized risks. As an example, a combination of two methods from modern statistical machine learning, i.e. kernel logistic regression and εε-support vector regression, is used to model a data set from several insurance companies. The approach can also be helpful to fit robust estimators in parametric models for huge data sets.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computational Statistics & Data Analysis - Volume 52, Issue 1, 15 September 2007, Pages 347–361
نویسندگان
, , ,