Article code: 1145457
Journal code: 1489667
Publication year: 2015
English article: 19-page PDF
Full-text version: free download
English title of the ISI article
Does modeling lead to more accurate classification?: A study of relative efficiency in linear classification
Persian translation of the title
آیا مدل سازی به طبقه بندی دقیق تر منجر می شود: مطالعه کارایی نسبی در طبقه بندی خطی
Related topics
Engineering and Basic Sciences, Mathematics, Numerical Analysis
English abstract

Classification arises in a wide range of applications. A variety of statistical tools have been developed for learning classification rules from data. Understanding of their relative merits and comparisons help users to choose a proper method in practice. This paper focuses on theoretical comparison of model-based classification methods in statistics with algorithmic methods in machine learning in terms of the error rate. Extending Efron’s comparison of logistic regression with linear discriminant analysis (LDA) under the normal setting, we contrast such algorithmic methods as the support vector machine (SVM) and boosting with the LDA and logistic regression and study their relative efficiencies in reducing the error rate based on the limiting behavior of the classification boundary of each method. We show that algorithmic methods are generally less effective than model-based methods in the normal setting. In particular, loss of efficiency in error rate is typically about 33% to 60% for the SVM and 50% to 80% for boosting when compared to the LDA. However, a smooth variant of the SVM is shown to be even more efficient than logistic regression. In addition to the theoretical study, we present results from numerical experiments under various settings for comparisons of finite-sample performance and robustness to mislabeling and model misspecification.

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Multivariate Analysis - Volume 133, January 2015, Pages 232–250