Article code: 561059
Journal code: 1451936
Publication year: 2017
English article: 7-page PDF
English title of the ISI article
High-dimensional variable selection in regression and classification with missing data
Persian translation of the title
انتخاب متغیر بالا بعدی در رگرسیون و طبقه بندی با داده های از دست رفته
Keywords
Adaptive lasso; logistic regression; low-rank recovery; matrix completion
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
Abstract


• A fast method for high-dimensional regression and classification with missing data is proposed.
• The proposed method combines matrix completion and adaptive lasso.
• It provides promising empirical results.

Variable selection for high-dimensional data problems, including both regression and classification, has been a subject of intense research activity in recent years. Many promising solutions have been proposed. However, less attention has been given to the case when some of the data are missing. This paper proposes a general approach to high-dimensional variable selection in the presence of missing data when the missing fraction can be relatively large (e.g., 50%). Both regression and classification are considered. The proposed approach iterates between two major steps: the first step uses matrix completion to impute the missing data, while the second step applies adaptive lasso to the imputed data to select the significant variables. Methods are provided for choosing all the involved tuning parameters. Because fast algorithms and software are widely available for matrix completion and adaptive lasso, the proposed approach is fast and straightforward to implement. Results from numerical experiments and applications to two real data sets are presented to demonstrate the efficiency and effectiveness of the approach.
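The two steps named in the abstract can be illustrated with a minimal sketch for the regression case. Everything specific here is an assumption and not taken from the paper: the abstract does not say which matrix completion algorithm is used, how the two steps feed back into each other, or how the tuning parameters are chosen. The sketch therefore runs a single pass, uses a soft-thresholded SVD scheme in the style of Soft-Impute for the completion step, uses a ridge fit to build the adaptive lasso weights, and fixes all tuning values by hand. The function names (soft_impute, weighted_lasso, adaptive_lasso) and the synthetic data are hypothetical.

```python
import numpy as np

def soft_impute(X, mask, lam=5.0, n_iter=100):
    """Fill unobserved entries via repeated soft-thresholded SVD (assumed scheme)."""
    Z = np.where(mask, X, 0.0)  # start with zeros in the missing cells
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        low_rank = (U * np.maximum(s - lam, 0.0)) @ Vt
        Z = np.where(mask, X, low_rank)  # keep observed entries, refresh the rest
    return Z

def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j weights[j] * |b_j|."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float).copy()  # residual y - Xb (b is zero at the start)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * b[j]  # remove column j's current contribution
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam * weights[j], 0.0) / col_sq[j]
            r -= X[:, j] * b[j]  # add the updated contribution back
    return b

def adaptive_lasso(X, y, lam=0.1, gamma=1.0, ridge=1.0):
    """Adaptive lasso with weights 1/|b_init|^gamma from a ridge fit (assumed initializer)."""
    p = X.shape[1]
    b_init = np.linalg.solve(X.T @ X + ridge * np.eye(p), X.T @ y)
    weights = 1.0 / (np.abs(b_init) ** gamma + 1e-8)
    return weighted_lasso(X, y, lam, weights)

# Synthetic example (hypothetical data): approximately low-rank design, 30% missing.
rng = np.random.default_rng(0)
n, p, rank = 100, 20, 3
X_full = rng.normal(size=(n, rank)) @ rng.normal(size=(rank, p))
X_full += 0.3 * rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:2] = [3.0, -2.0]
y = X_full @ beta_true + 0.1 * rng.normal(size=n)

mask = rng.random((n, p)) > 0.3            # True where an entry is observed
X_obs = np.where(mask, X_full, np.nan)     # missing entries stored as NaN

X_filled = soft_impute(X_obs, mask)        # step 1: impute by matrix completion
beta = adaptive_lasso(X_filled, y)         # step 2: select variables
print("selected variables:", np.flatnonzero(beta))
```

For the classification case mentioned in the abstract, the squared-error loss in weighted_lasso would be replaced by a logistic loss. In practice the fixed values of lam, gamma, and ridge above would be chosen by a data-driven rule such as cross-validation; the paper states that it provides its own methods for this, which are not reproduced here.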

Publisher
Database: Elsevier - ScienceDirect
Journal: Signal Processing - Volume 131, February 2017, Pages 1–7