Inverse random under sampling for class imbalance problem and its application to multi-label classification

Article ID	Journal	Published Year	Pages	File Type
530241	Pattern Recognition	2012	13 Pages	PDF

Abstract

In this paper, a novel inverse random under sampling (IRUS) method is proposed for the class imbalance problem. The main idea is to severely under-sample the majority class, thus creating a large number of distinct training sets. For each training set, we then find a decision boundary that separates the minority class from the majority class. By combining the multiple designs through fusion, we construct a composite boundary between the majority class and the minority class. The proposed methodology is applied to 22 UCI data sets, and experimental results indicate a significant increase in performance when compared with many existing class-imbalance learning methods. We also present promising results for multi-label classification, a challenging research problem in many modern applications such as music, text and image categorization.
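The procedure described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a nearest-centroid base learner and score averaging as the fusion rule; both are stand-ins chosen for brevity, and the names `train_irus`, `minority_score`, `predict`, `subset_size` and `n_sets` are hypothetical. Each majority subset is drawn smaller than the minority class, so the imbalance is inverted in every training set.

```python
import random


def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_irus(minority, majority, subset_size, n_sets, seed=0):
    """Build n_sets base models, each from all minority samples plus a
    small random majority subset (assumed design, not the paper's code)."""
    if not 0 < subset_size < len(minority):
        raise ValueError("subset_size must be positive and smaller than "
                         "the minority class to invert the imbalance")
    rng = random.Random(seed)
    c_min = centroid(minority)
    models = []
    for _ in range(n_sets):
        # Severe under-sampling: draw a few majority samples without replacement.
        subset = rng.sample(majority, subset_size)
        # Stand-in base learner: one centroid per class defines a linear boundary.
        models.append((c_min, centroid(subset)))
    return models


def minority_score(models, x):
    """Fuse the base models by averaging their scores (assumed fusion rule).
    A positive value means x lies closer to the minority centroid."""
    scores = [sq_dist(x, c_maj) - sq_dist(x, c_min) for c_min, c_maj in models]
    return sum(scores) / len(scores)


def predict(models, x):
    """Return 1 for the minority class and 0 for the majority class."""
    return 1 if minority_score(models, x) > 0 else 0
```

Calling `train_irus(minority, majority, subset_size=2, n_sets=20)` and then `predict(models, x)` classifies a new point `x`. Any other base classifier or fusion rule could replace the stand-ins used here without changing the sampling scheme.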

► An inverse random under sampling method is proposed for the class imbalance problem.
► The idea is to maintain a high true positive rate by imbalance inversion.
► The false positive rate is controlled by classifier bagging.
► The proposed methodology is evaluated on 22 UCI data sets.
► The method is used to improve the accuracy of multi-label classification.

Keywords

Multi-label classification; Class imbalance problem