Article code: 391733
Journal code: 661934
Publication year: 2016
English article: 19 pages, PDF
Full-text version: Free download
English title of the ISI article
Ordering-based pruning for improving the performance of ensembles of classifiers in the framework of imbalanced datasets
Persian translation of the title
هرس مبتنی بر سفارشات برای بهبود عملکرد مجموعه های طبقه بندی ها در چارچوب مجموعه های عدم توازن
Keywords
Imbalanced datasets, tree-based ensembles, ordering-based pruning, bagging, boosting
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract


• The use of ordering based pruning approaches for ensemble learning in imbalanced classification is proposed.
• Standard pruning schemes have been adapted to the framework of imbalanced data.
• BB-Imb and RE-GM metrics allow a significant gain in the studied models, allowing baseline methodologies to be outperformed.
• The Boosting Based Imbalanced approach in conjunction with UnderBagging has excelled as the best option.
• Conclusions are supported by a thorough experimental study with 66 datasets.

The scenario of classification with imbalanced datasets has gained a notorious significance in the last years. This is due to the fact that a large number of problems where classes are highly skewed may be found, affecting the global performance of the system. A great number of approaches have been developed to address this problem. These techniques have been traditionally proposed under three different perspectives: data treatment, adaptation of algorithms, and cost-sensitive learning.

Ensemble-based models for classifiers are an extension over the former solutions. They consider a pool of classifiers, and they can in turn integrate any of these proposals. The quality and performance of this type of methodology over baseline solutions have been shown in several studies of the specialized literature.

The goal of this work is to improve the capabilities of tree-based ensemble-based solutions that were specifically designed for imbalanced classification, focusing on the best behaving bagging- and boosting-based ensembles in this scenario. In order to do so, this paper proposes several new metrics for ordering-based pruning, which are properly adapted to address the skewed-class distribution. From our experimental study we show two main results: on the one hand, the use of the new metrics allows pruning to become a very successful approach in this scenario; on the other hand, the behavior of Under-Bagging model excels, achieving the highest gain with the usage of pruning, since the random undersampled sets that best complement each other can be selected. Accordingly, this scheme is capable of outperforming previous ensemble models selected from the state-of-the-art.
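
The abstract does not spell out the pruning metrics, so the following is only a minimal sketch of the general idea of ordering-based pruning with an imbalance-aware score. It assumes a reduce-error style greedy ordering scored by the geometric mean of the per-class accuracies, which is one plausible reading of a metric such as RE-GM, not the paper's exact definition. The function names, the tie-breaking rule in the vote, the fixed `keep` size and the toy validation data are all hypothetical.

```python
from math import sqrt


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of the true positive rate and the true negative rate."""
    pos = [p for t, p in zip(y_true, y_pred) if t == positive]
    neg = [p for t, p in zip(y_true, y_pred) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    tpr = sum(p == positive for p in pos) / len(pos)
    tnr = sum(p != positive for p in neg) / len(neg)
    return sqrt(tpr * tnr)


def majority_vote(votes, positive=1, negative=0):
    """Combine label lists from several classifiers.

    Assumption: ties are resolved in favour of the positive (minority) class.
    """
    n = len(votes)
    return [
        positive if 2 * sum(v[i] == positive for v in votes) >= n else negative
        for i in range(len(votes[0]))
    ]


def order_and_prune(predictions, y_true, keep):
    """Greedily order the pool, then keep the first `keep` classifiers.

    predictions[j][i] is the label that classifier j assigns to validation
    example i. At each step, the classifier whose addition gives the current
    sub-ensemble the highest G-mean is appended. Returns the selected indices
    in the order in which they were chosen.
    """
    remaining = list(range(len(predictions)))
    selected = []
    while remaining and len(selected) < keep:
        best = max(
            remaining,
            key=lambda j: g_mean(
                y_true,
                majority_vote([predictions[k] for k in selected + [j]]),
            ),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy validation set (hypothetical): 2 minority examples, 4 majority examples.
y_val = [1, 1, 0, 0, 0, 0]
pool_preds = [
    [0, 0, 0, 0, 0, 0],  # always predicts the majority class
    [1, 1, 1, 1, 0, 0],  # finds the minority class, errs on two majority examples
    [1, 0, 0, 0, 0, 0],  # conservative, misses one minority example
    [1, 1, 0, 0, 1, 1],  # errs on the other two majority examples
]

chosen = order_and_prune(pool_preds, y_val, keep=3)
print(chosen)  # [1, 0, 3]
```

In this toy pool no single classifier reaches a G-mean above about 0.71, yet the three selected members vote their way to a perfect score on the validation set because their errors fall on different examples. That is the kind of complementarity the abstract attributes to pruning over randomly undersampled bags.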

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 354, 1 August 2016, Pages 178–196
Authors
, , , , ,