Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
406459 | 678086 | 2014 | 11-page PDF | Free download |
• A resampling ensemble algorithm is developed for imbalanced classification problems.
• The scales of oversampling and undersampling are empirically analyzed.
• Experimental results show that the proposed method can greatly improve performance.
• Algorithm performance is related to the ratio of data size to the number of attributes.
This paper develops a resampling ensemble algorithm for classification problems on imbalanced datasets. In the method, small classes are oversampled and large classes are undersampled, with the resampling scale determined by the ratio of the smallest class size to the largest class size. Multiple machine learning methods are then selected to construct the ensemble. Numerical results show that the algorithm's performance is strongly related to the ratio of the minority class size to the number of attributes: when this ratio is less than 3, performance is greatly hindered. Experimental results also show that an ensemble of different types of methods can effectively improve the algorithm's performance.
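The resampling and ensemble steps described above can be illustrated with a minimal sketch. The abstract does not give the exact formula for the resampling scale, so the rule used here (resampling every class to the geometric mean of the smallest and largest class sizes, which equals the largest size times the square root of the min/max ratio) is an assumption, not the authors' method. The function names `resample_balanced`, `majority_vote`, and `minority_to_attribute_ratio` are hypothetical, and plain majority voting stands in for whatever combination rule the paper uses.

```python
import math
import random
from collections import Counter, defaultdict


def resample_balanced(X, y, seed=0):
    """Oversample small classes and undersample large ones to a common size.

    Assumption: the common target size is the geometric mean of the
    smallest and largest class sizes. The paper's actual rule may differ.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    sizes = [len(rows) for rows in by_class.values()]
    n_min, n_max = min(sizes), max(sizes)
    target = round(math.sqrt(n_min * n_max))  # hypothetical scale rule

    X_out, y_out = [], []
    for label, rows in by_class.items():
        if len(rows) >= target:
            # Undersample: draw `target` rows without replacement.
            picked = rng.sample(rows, target)
        else:
            # Oversample: keep all rows, add duplicates drawn with replacement.
            picked = rows + rng.choices(rows, k=target - len(rows))
        X_out.extend(picked)
        y_out.extend([label] * target)
    return X_out, y_out


def majority_vote(predictions):
    """Combine per-model label lists into one voted label per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def minority_to_attribute_ratio(X, y):
    """Size of the smallest class divided by the number of attributes."""
    return min(Counter(y).values()) / len(X[0])
```

In use, each base learner would be trained on its own resampled copy of the data (for example with a different `seed`), and the per-model predictions passed to `majority_vote`. Computing `minority_to_attribute_ratio` beforehand gives a quick check against the threshold of 3 reported in the abstract.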
Journal: Neurocomputing - Volume 143, 2 November 2014, Pages 57–67