Article code: 392556
Journal code: 664777
Publication year: 2014
English article: 19-page PDF
Full-text version: free download
English title of the ISI article
Feature selection for high-dimensional class-imbalanced data sets using Support Vector Machines
Persian translation of the title
انتخاب ویژگی برای مجموعه داده‌های با ابعاد بالا و کلاس‌های نامتوازن با استفاده از ماشین‌های بردار پشتیبان
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Feature selection and classification of imbalanced data sets are two of the most interesting machine learning challenges, attracting growing attention from both industry and academia. Feature selection addresses the dimensionality reduction problem by determining a subset of the available features with which to build a good model for classification or prediction, while the class-imbalance problem arises when the class distribution is too skewed. Both issues have been studied independently in the literature, and a plethora of methods addressing high dimensionality as well as class imbalance have been proposed. The aim of this work is to explore both issues simultaneously, proposing a family of methods that select the attributes that are relevant for identifying the target class in binary classification. We propose a backward elimination approach based on successive holdout steps, whose contribution measure is based on a balanced loss function obtained on an independent subset. Our experiments are based on six highly imbalanced microarray data sets; we compare our methods with well-known feature selection techniques and obtain better predictions with consistently fewer relevant features.
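
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it names: backward elimination driven by a balanced loss measured on a holdout subset. Everything here is an assumption made for illustration: the function names, the use of the balanced error rate (mean of per-class error rates) as the loss, the fresh stratified split at each step, and especially the classifier. A nearest-centroid rule stands in for the Support Vector Machine so that the example needs only the standard library. It should not be read as the authors' method.

```python
import random

def balanced_error(y_true, y_pred):
    """Mean of the per-class error rates; labels are assumed to be +1 / -1."""
    errs = []
    for cls in (1, -1):
        idx = [i for i, label in enumerate(y_true) if label == cls]
        if idx:  # skip a class that does not occur in y_true
            errs.append(sum(y_pred[i] != cls for i in idx) / len(idx))
    return sum(errs) / len(errs)

def stratified_split(y, holdout_frac, rng):
    """Split indices so both classes appear in the training and holdout parts.

    Assumes every class has at least two samples.
    """
    train, hold = [], []
    for cls in (1, -1):
        idx = [i for i, label in enumerate(y) if label == cls]
        rng.shuffle(idx)
        k = max(1, round(holdout_frac * len(idx)))
        hold += idx[:k]
        train += idx[k:]
    return train, hold

def centroid_predict(X_tr, y_tr, X_te, feats):
    """Stand-in classifier (NOT an SVM): nearest class centroid on `feats`."""
    cents = {}
    for cls in (1, -1):
        rows = [x for x, label in zip(X_tr, y_tr) if label == cls]
        cents[cls] = [sum(r[f] for r in rows) / len(rows) for f in feats]
    preds = []
    for x in X_te:
        dist = {c: sum((x[f] - m) ** 2 for f, m in zip(feats, cents[c]))
                for c in cents}
        preds.append(min(dist, key=dist.get))
    return preds

def holdout_backward_elimination(X, y, n_keep, holdout_frac=0.3, seed=0,
                                 predict=centroid_predict):
    """Drop one feature per step until `n_keep` features remain.

    Each step draws a new stratified holdout split and removes the feature
    whose removal yields the lowest balanced error on the holdout part.
    """
    rng = random.Random(seed)
    feats = list(range(len(X[0])))
    while len(feats) > n_keep:
        tr, te = stratified_split(y, holdout_frac, rng)
        X_tr, y_tr = [X[i] for i in tr], [y[i] for i in tr]
        X_te, y_te = [X[i] for i in te], [y[i] for i in te]
        loss_without = {}
        for f in feats:
            rest = [g for g in feats if g != f]
            loss_without[f] = balanced_error(
                y_te, predict(X_tr, y_tr, X_te, rest))
        # The feature whose removal hurts least is treated as least relevant.
        feats.remove(min(feats, key=lambda f: loss_without[f]))
    return feats
```

Calling `holdout_backward_elimination(X, y, n_keep=10)` on a list of feature rows `X` and a list of +1 / -1 labels `y` returns the indices of the retained features. Any function with the same signature as `centroid_predict`, for example a wrapper around an SVM library, could be passed through the `predict` argument. Averaging the error over the two classes keeps the majority class from dominating the loss, which is the reason a balanced measure is a natural fit for skewed data.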

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 286, 1 December 2014, Pages 228–246
Authors
, , ,