Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
6870245 | 681361 | 2014 | 18-page PDF | Free download |
English title of the ISI article
Nonparametric variable selection and classification: The CATCH algorithm
Related topics
Engineering and Basic Sciences
Computer Engineering
Computational Theory and Mathematics
English abstract
The problem of classifying a categorical response Y is considered in a nonparametric framework. The distribution of Y depends on a vector of predictors X, where the coordinates Xj of X may be continuous, discrete, or categorical. An algorithm is constructed to select the variables to be used for classification. For each variable Xj, an importance score sj is computed to measure the strength of association of Xj with Y. The algorithm deletes Xj if sj falls below a certain threshold. It is shown in Monte Carlo simulations that the algorithm has a high probability of only selecting variables associated with Y. Moreover when this variable selection rule is used for dimension reduction prior to applying classification procedures, it improves the performance of these procedures. The approach for computing importance scores is based on root Chi-square type statistics computed for randomly selected regions (tubes) of the sample space. The size and shape of the regions are adjusted iteratively and adaptively using the data to enhance the ability of the importance score to detect local relationships between the response and the predictors. These local scores are then averaged over the tubes to form a global importance score sj for variable Xj. When confounding and spurious associations are issues, the nonparametric importance score for variable Xj is computed conditionally by using tubes to restrict the other variables. This variable selection procedure is called CATCH (Categorical Adaptive Tube Covariate Hunting). Asymptotic properties, including consistency, are established.
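As a rough illustration of the idea in the abstract, the sketch below scores each variable by averaging root chi-square statistics over randomly centered tubes and keeps the variables whose score reaches a threshold. It is a simplified sketch, not the authors' CATCH implementation: tubes have a fixed size instead of being adapted iteratively, only numeric predictors are handled, and the names `root_chi2`, `importance_score`, `select_variables` and every default parameter value are assumptions made for this example.

```python
import numpy as np


def root_chi2(x_bins, y, n_bins, n_classes):
    """Square root of Pearson's chi-square for a bins-by-classes count table."""
    table = np.zeros((n_bins, n_classes))
    np.add.at(table, (x_bins, y), 1)  # accumulate counts per (bin, class) cell
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    expected = row * col / table.sum()
    mask = expected > 0  # skip cells in empty rows or columns
    return np.sqrt((((table - expected) ** 2)[mask] / expected[mask]).sum())


def importance_score(X, y, j, n_tubes=50, frac=0.3, n_bins=3, rng=None):
    """Average local root chi-square of X[:, j] versus y over random tubes.

    Assumptions of this sketch: y holds integer class codes 0..K-1, all
    predictors are numeric and non-constant, and each tube is the fixed
    fraction `frac` of points closest to a random center in the OTHER
    variables (no adaptive resizing).
    """
    rng = np.random.default_rng() if rng is None else rng
    n, p = X.shape
    n_classes = int(y.max()) + 1
    # Discretize X_j at its global quantiles so every tube uses the same bins.
    edges = np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1)[1:-1])
    x_bins = np.searchsorted(edges, X[:, j])
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # put variables on one scale
    others = [k for k in range(p) if k != j]
    size = max(1, int(frac * n))
    scores = []
    for _ in range(n_tubes):
        if others:
            center = rng.integers(n)
            # Chebyshev distance to the center in the other variables only,
            # so X_j itself stays unrestricted inside the tube.
            dist = np.abs(Z[:, others] - Z[center, others]).max(axis=1)
            idx = np.argsort(dist)[:size]
        else:
            idx = np.arange(n)  # single predictor: the tube is the full sample
        scores.append(root_chi2(x_bins[idx], y[idx], n_bins, n_classes))
    return float(np.mean(scores))


def select_variables(X, y, threshold, seed=0):
    """Indices of variables whose importance score is at least `threshold`."""
    rng = np.random.default_rng(seed)
    scores = [importance_score(X, y, j, rng=rng) for j in range(X.shape[1])]
    return [j for j, s in enumerate(scores) if s >= threshold]


# Toy usage: only the first of four predictors is associated with y.
demo_rng = np.random.default_rng(0)
X_demo = demo_rng.uniform(-1, 1, size=(600, 4))
y_demo = (X_demo[:, 0] > 0).astype(int)
print(select_variables(X_demo, y_demo, threshold=5.0))
```

Because every tube here contains the same number of points, the scores are comparable across variables; the threshold of 5.0 is an arbitrary choice for the toy data and would need to be calibrated in practice.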
Publisher
Database: Elsevier - ScienceDirect
Journal: Computational Statistics & Data Analysis - Volume 72, April 2014, Pages 158-175
Authors
Shijie Tang, Lisha Chen, Kam-Wah Tsui, Kjell Doksum