Article code: 535518
Journal code: 870351
Publication year: 2013
English article: 8 pages, PDF
Full-text version: Free download
English title of the ISI article
Variance inflation in high dimensional Support Vector Machines
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
English abstract


• Variance inflation can lead to decreased performance in SVMs.
• A non-parametric calibration scheme can be implemented for restoration of generalizability.
• Viability was demonstrated on 18 benchmark data sets.

Many important machine learning models, supervised and unsupervised, are based on simple Euclidean distance or orthogonal projection in a high dimensional feature space. When estimating such models from small training sets, we face the problem that the span of the training data set input vectors is not the full input space. Hence, when the model is applied to future data, it is effectively blind to the missed orthogonal subspace. This can lead to an inflated variance of hidden variables estimated in the training set, and when the model is applied to test data we may find that the hidden variables follow a different probability law with less variance. While the problem and the basic means to reconstruct and deflate are well understood in unsupervised learning, the case of supervised learning is less well understood. Here we investigate the effect of variance inflation in supervised learning, including the case of Support Vector Machines (SVMs), and we propose a non-parametric scheme to restore proper generalizability. We illustrate the algorithm and its ability to restore performance on a wide range of benchmark data sets.
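
The variance inflation effect described in the abstract can be reproduced with a small simulation. The sketch below is only an illustration of the unsupervised (PCA) case, not the authors' SVM calibration scheme: the data model (isotropic Gaussian noise plus a single higher-variance signal axis), the function name `inflation_demo`, and all parameter values are assumptions made for this example. It estimates the first principal direction from a small training set in a high dimensional space and compares the standard deviation of the projections on training data with that on independent test data.

```python
import numpy as np


def inflation_demo(n_train=50, n_test=2000, dim=1000, signal_sd=3.0, seed=0):
    """Return (train_sd, test_sd) of projections onto the first training PC.

    Hypothetical setup: unit-variance isotropic noise in `dim` dimensions,
    with one signal axis (coordinate 0) scaled to standard deviation
    `signal_sd`.
    """
    rng = np.random.default_rng(seed)

    def sample(n):
        x = rng.standard_normal((n, dim))
        x[:, 0] *= signal_sd  # single high-variance signal direction
        return x

    x_train, x_test = sample(n_train), sample(n_test)
    mean = x_train.mean(axis=0)

    # First principal direction, estimated from the small training set only.
    _, _, vt = np.linalg.svd(x_train - mean, full_matrices=False)
    w = vt[0]

    # Project both sets onto the same estimated direction.
    train_proj = (x_train - mean) @ w
    test_proj = (x_test - mean) @ w
    return train_proj.std(), test_proj.std()


train_sd, test_sd = inflation_demo()
print(f"train sd: {train_sd:.2f}, test sd: {test_sd:.2f}")
```

Because the direction is fitted to only 50 points in 1000 dimensions, the training projections should come out with a markedly larger spread than the test projections. This is the mismatch between training and test distributions that a calibration step would need to correct.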

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 34, Issue 16, 1 December 2013, Pages 2173–2180