Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6861288 | Knowledge-Based Systems | 2018 | 25 Pages |
Abstract
Datasets with skewed class distributions pose a difficulty for learning algorithms in pattern classification. A number of methods for dealing with this problem have been developed in recent years. In particular, synthetic oversampling techniques balance the distribution between the training instances of the majority and minority classes by generating additional artificial minority class instances. Unfortunately, few of them can be extended to handle imbalanced data with missing values. Moreover, in most cases, existing oversampling methods do not make full use of the correlation between attributes. To this end, we propose a fuzzy rule-based oversampling technique (FRO) to handle the class imbalance problem. FRO first creates fuzzy rules from the training data and assigns each of them a rule weight, which represents the degree of certainty that an instance belongs to the fuzzy subspace. It then synthesizes new minority instances under the guidance of the fuzzy rules. The number of minority instances generated under a given fuzzy rule is determined by its rule weight. In a similar way, FRO can also recover the missing values in an imbalanced dataset. Extensive experiments on 55 real-world imbalanced datasets evaluate the performance of the proposed FRO technique. The results show that our method is better than or comparable to a set of alternative state-of-the-art imbalanced classification algorithms in terms of various assessment metrics.
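The rule-guided generation step described in the abstract can be illustrated with a rough sketch. The abstract does not give the membership functions, the rule-weight formula, or the sampling scheme, so everything below is an assumption made for illustration and not the authors' FRO implementation: attributes are partitioned into evenly spaced triangular fuzzy sets, a rule weight is approximated by the share of minority instances among those covered by the rule, and new instances are interpolated between minority instances under the same rule. The names `fuzzy_cells` and `oversample_by_rules` are hypothetical.

```python
import numpy as np

def fuzzy_cells(X, lo, hi, n_sets=3):
    """Index of the best-matching fuzzy set for every attribute value.

    Assumption: n_sets evenly spaced triangular sets per attribute, so the
    set with the highest membership is the one with the nearest centre.
    """
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    scaled = (X - lo) / span
    return np.clip(np.rint(scaled * (n_sets - 1)), 0, n_sets - 1).astype(int)

def oversample_by_rules(X, y, minority, n_new, n_sets=3, seed=0):
    """Generate n_new synthetic minority instances, allocated by rule weight."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    is_min = y == minority
    if not is_min.any():
        raise ValueError("no minority instances found")

    # One candidate rule per distinct antecedent (combination of fuzzy sets).
    cells = fuzzy_cells(X, X.min(axis=0), X.max(axis=0), n_sets)
    rules, inverse = np.unique(cells, axis=0, return_inverse=True)
    inverse = inverse.ravel()

    # Assumed rule weight: fraction of covered instances that are minority,
    # a crude stand-in for the certainty degree mentioned in the abstract.
    covered = np.bincount(inverse, minlength=len(rules))
    covered_min = np.bincount(inverse, weights=is_min.astype(float),
                              minlength=len(rules))
    weights = covered_min / covered

    # Rules with larger weights receive proportionally more new instances.
    picks = rng.choice(len(rules), size=n_new, p=weights / weights.sum())

    synthetic = np.empty((n_new, X.shape[1]))
    for i, r in enumerate(picks):
        members = X[(inverse == r) & is_min]
        a, b = members[rng.integers(len(members), size=2)]
        # Interpolating between two instances of the same rule keeps the
        # new point inside that rule's region of the attribute space.
        synthetic[i] = a + rng.random() * (b - a)
    return synthetic, rules, weights
```

Calling `oversample_by_rules(X, y, minority=1, n_new=50)` returns the synthetic instances together with the rule antecedents and their assumed weights. Recovery of missing values, which the abstract says is handled in a similar way, is not covered by this sketch.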
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Gencheng Liu, Youlong Yang, Benchong Li