Asymptotics of cross-validated risk estimation in estimator selection and performance assessment

Article code	Journal code	Publication year	English article
10525770	958224	2005	24-page PDF

Keywords

Asymptotic linearity; Performance assessment; Cross-validation; Model selection; Density estimation; Asymptotic optimality; Quadratic loss function; Loss function; Generalization error; Risk; Regression; Classification; Confidence interval; Prediction

Related topics

Engineering and Basic Sciences, Mathematics, Statistics and Probability

Abstract

Risk estimation is an important statistical question for the purposes of selecting a good estimator (i.e., model selection) and assessing its performance (i.e., estimating generalization error). This article introduces a general framework for cross-validation and derives distributional properties of cross-validated risk estimators in the context of estimator selection and performance assessment. Arbitrary classes of estimators are considered, including density estimators and predictors for both continuous and polychotomous outcomes. Results are provided for general full data loss functions (e.g., absolute and squared error, indicator, negative log density). A broad definition of cross-validation is used in order to cover leave-one-out cross-validation, V-fold cross-validation, Monte Carlo cross-validation, and bootstrap procedures. For estimator selection, finite sample risk bounds are derived and applied to establish the asymptotic optimality of cross-validation, in the sense that a selector based on a cross-validated risk estimator performs asymptotically as well as an optimal oracle selector based on the risk under the true, unknown data generating distribution. The asymptotic results are derived under the assumption that the size of the validation sets converges to infinity and hence do not cover leave-one-out cross-validation. For performance assessment, cross-validated risk estimators are shown to be consistent and asymptotically linear for the risk under the true data generating distribution and confidence intervals are derived for this unknown risk. Unlike previously published results, the theorems derived in this and our related articles apply to general data generating distributions, loss functions (i.e., parameters), estimators, and cross-validation procedures.
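To make the two uses of cross-validated risk in the abstract concrete, the following is a minimal Python sketch of V-fold cross-validation with squared error loss. It is not taken from the article. The candidate estimators (polynomial fits of degree 0 to 3), the simulated data, and the function names `v_fold_cv_risk`, `make_poly`, and `select_and_assess` are all assumptions made for illustration. The interval is a plain normal approximation built from the pooled validation losses, which is a simplification and not necessarily the construction derived in the article.

```python
import numpy as np


def v_fold_cv_risk(x, y, fit, predict, loss, v=5, seed=0):
    """Return (risk estimate, per-observation validation losses) for one candidate."""
    n = len(y)
    rng = np.random.default_rng(seed)
    # Randomly partition the n observations into v validation sets.
    folds = np.array_split(rng.permutation(n), v)
    losses = np.empty(n)
    for val_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[val_idx] = False
        # Fit on the training sample, evaluate the loss on the held-out validation set.
        model = fit(x[train_mask], y[train_mask])
        losses[val_idx] = loss(y[val_idx], predict(model, x[val_idx]))
    # Pooled mean of the validation losses is the cross-validated risk estimate.
    return losses.mean(), losses


def squared_error(y_true, y_pred):
    return (y_true - y_pred) ** 2


def make_poly(degree):
    """Hypothetical candidate estimator: least-squares polynomial of a given degree."""
    def fit(x, y):
        return np.polyfit(x, y, degree)

    def predict(coefs, x):
        return np.polyval(coefs, x)

    return fit, predict


def select_and_assess(x, y, degrees=(0, 1, 2, 3), v=5, z=1.96):
    """Pick the candidate with the smallest cross-validated risk and attach an interval."""
    results = {}
    for d in degrees:
        fit, predict = make_poly(d)
        results[d] = v_fold_cv_risk(x, y, fit, predict, squared_error, v=v)
    best = min(results, key=lambda d: results[d][0])
    risk, losses = results[best]
    # Normal-approximation interval (an assumption of this sketch).
    half_width = z * losses.std(ddof=1) / np.sqrt(len(losses))
    return best, risk, (risk - half_width, risk + half_width)


# Simulated data (illustrative only): linear signal plus Gaussian noise.
rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=200)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=200)

best, risk, ci = select_and_assess(x, y)
print("selected degree:", best)
print("cross-validated risk:", risk)
print("approximate 95% interval:", ci)
```

With 200 observations and V = 5, each validation set holds 40 observations, in line with the abstract's requirement that validation sets grow with the sample size. Note that the interval here is computed for the selected candidate on the same folds used for selection, so it ignores the selection step and may be optimistic.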

Publisher

Database: Elsevier - ScienceDirect
Journal: Statistical Methodology - Volume 2, Issue 2, July 2005, Pages 131-154

Authors

Sandrine Dudoit, Mark J. van der Laan
