Article ID: 406459
Journal: Neurocomputing
Published Year: 2014
Pages: 11 Pages
File Type: PDF
Abstract

• A resampling ensemble algorithm is developed for imbalanced classification problems.
• The scales of oversampling and undersampling are empirically analyzed.
• Experimental results show that the proposed method can greatly improve performance.
• Algorithm performance is related to the ratio of data size to the number of attributes.

This paper develops a resampling ensemble algorithm for classification problems on imbalanced datasets. The method oversamples the small classes and undersamples the large classes, with the resampling scale determined by the ratio of the smallest class size to the largest class size. Multiple machine learning methods are then selected to construct the ensemble. Numerical results show that the algorithm's performance is strongly related to the ratio of the minority class size to the number of attributes: when this ratio is less than 3, performance is greatly hindered. Experimental results also show that an ensemble of different types of methods can effectively improve performance.
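
The abstract describes the procedure only in outline, so the following is a minimal Python sketch under stated assumptions, not the authors' implementation. The function names (`resample_balanced`, `majority_vote`, `minority_to_attribute_ratio`) are hypothetical. The target class size used here, the geometric mean of the smallest and largest class sizes, is an assumption: the abstract says only that the resampling scale depends on the ratio of those two sizes. Plurality voting is likewise an assumed way of combining the ensemble members.

```python
import math
import random
from collections import Counter, defaultdict


def resample_balanced(X, y, rng=None):
    """Return a resampled (X, y) in which every class has the same size.

    ASSUMPTION: the common target size is the geometric mean of the smallest
    and largest class sizes, i.e. n_max * sqrt(n_min / n_max). The abstract
    states only that the scale depends on that ratio, not the exact rule.
    """
    rng = rng or random.Random(0)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    sizes = [len(rows) for rows in by_class.values()]
    target = max(1, round(math.sqrt(min(sizes) * max(sizes))))

    X_out, y_out = [], []
    for label, rows in by_class.items():
        if len(rows) >= target:
            # Large class: undersample without replacement.
            chosen = rng.sample(rows, target)
        else:
            # Small class: keep every original row, then draw extra rows
            # with replacement until the target size is reached.
            chosen = rows + rng.choices(rows, k=target - len(rows))
        X_out.extend(chosen)
        y_out.extend([label] * target)
    return X_out, y_out


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample (plurality).

    ASSUMPTION: the abstract does not say how ensemble members are combined.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def minority_to_attribute_ratio(X, y):
    """Smallest class size divided by the number of attributes."""
    return min(Counter(y).values()) / len(X[0])
```

For a dataset with 100 majority samples, 4 minority samples, and 2 attributes, `resample_balanced` would bring both classes to 20 samples under the assumed rule, and `minority_to_attribute_ratio` would return 2.0, which falls below the threshold of 3 that the abstract associates with degraded performance.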

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence