Article code: 4969680
Journal code: 1449978
Publication year: 2017
English article: 33-page PDF
Full-text version: Free download
English title of the ISI article
Error estimation based on variance analysis of k-fold cross-validation
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
English abstract
Cross-validation (CV) is often used to estimate the generalization capability of a learning model. The variance of the CV error has a considerable impact on the accuracy of the CV estimator and the adequacy of the learning model, so it is important to analyze CV variance. The aim of this paper is to investigate how to improve the accuracy of error estimation based on variance analysis. We first describe the quantitative relationship between CV variance and CV accuracy, which provides guidance for improving accuracy by reducing variance. We then study the relationships between the variance and relevant variables, including the sample size, the number of folds, and the number of repetitions. These relationships form the basis of theoretical strategies for regulating CV variance. Our classification results can theoretically explain the empirical results of Rodríguez and Kohavi. Finally, we propose a uniform normalized variance that not only measures model accuracy but is also independent of the number of folds. It therefore simplifies the selection of the number of folds in k-fold CV, and the normalized variance can serve as a stable error measurement for model comparison and selection. We report the results of experiments using 5 supervised learning models and 20 datasets. The results indicate that the proposed theorems can reliably determine, before k-fold CV is run, which configuration has the smaller variance, and thus the accuracy of error estimation can be improved by reducing variance. In so doing, we are more likely to select the best parameter or model.
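To make the quantities in the abstract concrete, the following Python sketch runs repeated k-fold CV and reports the mean and the empirical variance of the error estimate across repetitions. It is only an illustration under stated assumptions, not the authors' method: the synthetic data, the toy threshold classifier, and the names `make_data`, `fit_threshold`, `kfold_error`, and `repeated_cv` are all hypothetical, and the code computes the plain variance across repetitions rather than the normalized variance proposed in the paper (which the abstract does not define).

```python
import random
import statistics


def make_data(n, rng):
    """Synthetic two-class 1-D data (assumption): class 0 ~ N(0, 1), class 1 ~ N(1.5, 1)."""
    data = []
    for i in range(n):
        label = i % 2
        data.append((rng.gauss(1.5 * label, 1.0), label))
    return data


def fit_threshold(train):
    """Toy learner (assumption): threshold halfway between the two class means."""
    mean0 = statistics.fmean(x for x, y in train if y == 0)
    mean1 = statistics.fmean(x for x, y in train if y == 1)
    return (mean0 + mean1) / 2.0, mean1 >= mean0


def predict(model, x):
    threshold, one_is_high = model
    return int((x > threshold) == one_is_high)


def kfold_error(data, k, rng):
    """One run of k-fold CV: every point is held out exactly once."""
    indices = list(range(len(data)))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    mistakes = 0
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in indices if i not in held_out]
        model = fit_threshold(train)
        mistakes += sum(predict(model, data[i][0]) != data[i][1] for i in fold)
    return mistakes / len(data)


def repeated_cv(data, k, repetitions, seed=0):
    """Mean and population variance of the k-fold error over repeated random partitions."""
    rng = random.Random(seed)
    errors = [kfold_error(data, k, rng) for _ in range(repetitions)]
    return statistics.fmean(errors), statistics.pvariance(errors)


data = make_data(100, random.Random(42))
for k in (2, 5, 10):
    mean_err, var_err = repeated_cv(data, k, repetitions=30)
    print(f"k={k:2d}  mean error={mean_err:.3f}  variance across repetitions={var_err:.6f}")
```

The variance printed here captures only the variability caused by re-partitioning a fixed sample. The variability due to drawing a different sample of the same size, which the abstract also relates to the sample size, would require regenerating `data` in an outer loop.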
Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 69, September 2017, Pages 94-106