Article code: 403525 | Journal code: 677260 | Publication year: 2015 | English article: 12-page PDF
English title of the ISI article
Support vector machine-based optimized decision threshold adjustment strategy for classifying imbalanced data
Persian title (rendered in English)
Support vector-based decision threshold adjustment strategy for classifying imbalanced data
Keywords
Class imbalance, support vector machine, decision threshold adjustment, optimization search, ensemble learning
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract


• We analyze theoretically why class imbalance can damage SVM performance.
• We propose the SVM-OTHR algorithm to find the optimal moving distance of the hyperplane.
• We integrate SVM-OTHR into a Bagging ensemble framework to improve its robustness.
• The time complexity of SVM-OTHR is only slightly higher than that of a standard SVM.
• The two proposed algorithms often outperform several other bias correction algorithms.

The class imbalance problem occurs when the numbers of training instances belonging to different classes differ markedly. In this scenario, many traditional classifiers fail to provide sufficiently good classification performance: the accuracy on the majority class is usually much higher than that on the minority class. In this article, we address the class imbalance problem by using a support vector machine (SVM) classifier with an optimized decision threshold adjustment strategy (SVM-OTHR), which answers a puzzling question: how far should the classification hyperplane be moved towards the majority class? Specifically, the proposed strategy is self-adapting and finds the optimal moving distance of the classification hyperplane according to the real distribution of the training samples. Furthermore, we extend the strategy into an ensemble version (EnSVM-OTHR) that can further improve classification performance. Both proposed algorithms are compared with many state-of-the-art classifiers on 30 skewed data sets from the KEEL data set repository, using two popular class imbalance evaluation metrics: F-measure and G-mean. The statistical results of the experiments indicate their superiority.
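
The idea of moving the decision threshold can be illustrated with a minimal Python sketch. It is not the SVM-OTHR procedure from the paper: the optimization search is replaced here by a brute-force scan over candidate thresholds, and the decision values in `scores` are made-up numbers standing in for the outputs of an already trained SVM. The function names (`confusion`, `g_mean`, `f_measure`, `best_threshold`) are hypothetical and chosen only for this illustration. The minority class is labeled +1 and the majority class -1.

```python
from math import sqrt

def confusion(scores, labels, threshold):
    """Return (tp, fn, tn, fp) when predicting +1 iff score > threshold."""
    tp = fn = tn = fp = 0
    for s, y in zip(scores, labels):
        predicted_positive = s > threshold
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp

def g_mean(tp, fn, tn, fp):
    """Geometric mean of the true positive rate and the true negative rate."""
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return sqrt(tpr * tnr)

def f_measure(tp, fn, tn, fp):
    """F1 score of the positive (minority) class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def best_threshold(scores, labels, metric=g_mean):
    """Brute-force scan (an assumption, not the paper's search): try 0 and
    every midpoint between consecutive distinct scores, keep the best one."""
    ordered = sorted(set(scores))
    candidates = [0.0] + [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best_t = 0.0
    best_v = metric(*confusion(scores, labels, 0.0))
    for t in candidates:
        v = metric(*confusion(scores, labels, t))
        if v > best_v:
            best_t, best_v = t, v
    return best_t, best_v

# Made-up decision values: the hyperplane is biased, so two of the three
# minority samples (+1) fall on the negative side of the default threshold 0.
scores = [0.4, -0.3, -0.6, -2.0, -1.5, -1.2, -1.1, -0.9, -1.8, -1.3, -0.8]
labels = [1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1]

default_g = g_mean(*confusion(scores, labels, 0.0))
shifted_t, shifted_g = best_threshold(scores, labels)
print(f"default threshold 0.0: G-mean = {default_g:.3f}")
print(f"shifted threshold {shifted_t:.2f}: G-mean = {shifted_g:.3f}")
```

On this toy data the selected threshold is negative, which corresponds to shifting the decision boundary towards the majority class. Passing `metric=f_measure` selects the threshold by F-measure instead.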

Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 76, March 2015, Pages 67–78