Article code: 495764
Journal code: 862837
Publication year: 2014
English article: 9 pages (PDF)
English title of the ISI article
Cost-sensitive decision tree ensembles for effective imbalanced classification
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Applications
Abstract


• A novel cost-sensitive ensemble, based on decision trees, for imbalanced classification.
• Evolutionary simultaneous classifier selection and fusion boosts the recognition rate of the minority class.
• An analysis of the influence of the cost matrix parameters and data imbalance ratio on the performance of the ensemble.
• A ROC-based tuning method of the ensemble parameters.
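
To make the role of the cost matrix in these highlights concrete, the following is a minimal sketch of a minimum-expected-cost decision at a single tree leaf. It is an illustration under stated assumptions, not the authors' implementation: the function name `min_expected_cost_label`, the leaf probabilities, and the cost values are all hypothetical.

```python
from typing import Sequence


def min_expected_cost_label(class_probs: Sequence[float],
                            cost_matrix: Sequence[Sequence[float]]) -> int:
    """Return the label with the lowest expected misclassification cost.

    cost_matrix[i][j] is the cost of predicting class j when the true
    class is i (assumed convention; correct decisions cost 0).
    """
    n_classes = len(class_probs)
    expected = [
        sum(class_probs[i] * cost_matrix[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    return min(range(n_classes), key=expected.__getitem__)


# Hypothetical leaf: 80% majority samples (class 0), 20% minority (class 1).
leaf_probs = [0.8, 0.2]

# Equal costs reproduce the usual majority-vote leaf label.
equal_costs = [[0, 1],
               [1, 0]]

# Illustrative costs: missing a minority sample is 5x worse than a false alarm.
skewed_costs = [[0, 1],
                [5, 0]]

print(min_expected_cost_label(leaf_probs, equal_costs))   # label under equal costs
print(min_expected_cost_label(leaf_probs, skewed_costs))  # label under skewed costs
```

With equal costs the leaf is labelled as the majority class. With the skewed matrix the expected cost of predicting the majority class (0.2 × 5 = 1.0) exceeds that of predicting the minority class (0.8 × 1 = 0.8), so the same leaf is labelled as the minority class.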

Real-life datasets are often imbalanced, that is, significantly more training samples are available for some classes than for others. Consequently, the conventional aim of maximizing overall classification accuracy is not appropriate for such problems. Various approaches to imbalanced datasets have been introduced in the literature, typically based on oversampling, undersampling or cost-sensitive classification. In this paper, we introduce an effective ensemble of cost-sensitive decision trees for imbalanced classification. Base classifiers are constructed according to a given cost matrix, but are trained on random feature subspaces to ensure sufficient diversity among the ensemble members. We employ an evolutionary algorithm that simultaneously selects classifiers and assigns committee member weights for the fusion process. The proposed algorithm is evaluated on a variety of benchmark datasets. It is confirmed to improve recognition of the minority class and to be capable of outperforming other state-of-the-art algorithms, and hence represents a useful and effective approach for dealing with imbalanced datasets.
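
The selection-and-fusion step described in the abstract can be pictured with a single weight vector in which a zero weight removes a committee member. The sketch below is a simplified illustration, not the algorithm from the paper. It assumes binary labels (1 = minority class) and takes the base classifiers' predictions as given rather than training trees on random feature subspaces. It uses the geometric mean of sensitivity and specificity as the fitness, and replaces the evolutionary algorithm with a simple elitist (1+1) mutation search. All names and data are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def fuse(votes: Sequence[int], weights: Sequence[float]) -> int:
    """Weighted vote over binary predictions (1 = minority class).

    A weight of 0 drops that member, so one weight vector encodes
    both classifier selection and fusion.
    """
    minority = sum(w for v, w in zip(votes, weights) if v == 1)
    majority = sum(w for v, w in zip(votes, weights) if v == 0)
    return 1 if minority > majority else 0


def g_mean(weights: Sequence[float],
           member_preds: Sequence[Sequence[int]],
           labels: Sequence[int]) -> float:
    """Geometric mean of sensitivity and specificity of the fused output.

    Assumes both classes occur in `labels`.
    """
    tp = tn = 0
    for i, y in enumerate(labels):
        y_hat = fuse([preds[i] for preds in member_preds], weights)
        tp += int(y == 1 and y_hat == 1)
        tn += int(y == 0 and y_hat == 0)
    pos = sum(labels)
    neg = len(labels) - pos
    return ((tp / pos) * (tn / neg)) ** 0.5


def evolve_weights(member_preds: Sequence[Sequence[int]],
                   labels: Sequence[int],
                   generations: int = 200,
                   seed: int = 0) -> Tuple[List[float], float]:
    """Elitist (1+1) search: mutate one weight, keep the child if not worse."""
    rng = random.Random(seed)
    best = [1.0] * len(member_preds)
    best_fit = g_mean(best, member_preds, labels)
    for _ in range(generations):
        child = list(best)
        k = rng.randrange(len(child))
        if rng.random() < 0.3:
            child[k] = 0.0  # switch the member off (selection)
        else:
            child[k] = max(0.0, child[k] + rng.gauss(0, 0.5))  # adjust weight (fusion)
        fit = g_mean(child, member_preds, labels)
        if fit >= best_fit:
            best, best_fit = child, fit
    return best, best_fit


# Hypothetical validation set: 8 majority samples, 2 minority samples.
labels = [0] * 8 + [1] * 2
member_preds = [
    [0] * 10,                        # member biased towards the majority class
    [0] * 10,                        # a second majority-biased member
    [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],  # member that detects the minority class
]

weights, fitness = evolve_weights(member_preds, labels)
print(weights, fitness)
```

Because the search only accepts children that are at least as fit as the current solution, the returned fitness is never lower than that of the equal-weight vote it starts from. On this toy data, plain majority voting misses both minority samples, whereas any weight vector in which the third member outweighs the other two recovers them.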

An example of improvement of minority class recognition using a cost-sensitive decision tree for a toy problem.

Publisher
Database: Elsevier - ScienceDirect
Journal: Applied Soft Computing - Volume 14, Part C, January 2014, Pages 554–562