Article ID: 4948135
Journal: Neurocomputing
Published Year: 2016
Pages: 10
File Type: PDF
Abstract
A new oversampling method called Cluster-based Weighted Oversampling for Ordinal Regression (CWOS-Ord) is proposed for ordinal regression on imbalanced datasets. Ordinal regression is a supervised approach for learning the ordinal relationship between classes. In many applications, the dataset is highly imbalanced: instances of some classes (majority classes) occur much more frequently than instances of other classes (minority classes). This imbalance significantly degrades classification performance, as classifiers tend to strongly favor the majority classes. Standard oversampling methods can be used to improve the class distribution of the dataset; however, they do not consider the ordinal relationship between the classes. The proposed CWOS-Ord method aims to address this problem by first clustering the minority classes and then oversampling them based on their distances and ordering relationship to the instances of other classes. The number of synthetic instances generated for each cluster depends on its complexity and its initial size: more synthetic instances are generated for smaller, more complex clusters, and fewer for larger, less complex ones. As a secondary contribution, existing oversampling methods for two-class classification are extended to ordinal regression. Results demonstrate that the proposed CWOS-Ord method performs significantly better than the other methods on the performance measures considered.
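The size rule described in the abstract (more synthetic instances for smaller, more complex clusters) can be illustrated with a minimal sketch. This is not the paper's formula: the abstract does not define the complexity measure or the exact weighting, so the function name `allocate_synthetic_counts`, the `complexity / size` weight, and the largest-remainder rounding are all assumptions made for illustration.

```python
def allocate_synthetic_counts(cluster_sizes, cluster_complexities, total_synthetic):
    """Split `total_synthetic` new instances across minority-class clusters.

    Assumed weighting (hypothetical, not taken from the paper):
    weight = complexity / size, so smaller and more complex clusters
    receive a larger share of the synthetic instances.
    """
    if len(cluster_sizes) != len(cluster_complexities):
        raise ValueError("one complexity score is required per cluster")
    if any(size <= 0 for size in cluster_sizes):
        raise ValueError("cluster sizes must be positive")

    weights = [c / s for c, s in zip(cluster_complexities, cluster_sizes)]
    total_weight = sum(weights)
    if total_weight == 0:
        return [0] * len(weights)

    # Ideal (fractional) share of each cluster, then round down.
    raw = [total_synthetic * w / total_weight for w in weights]
    counts = [int(r) for r in raw]

    # Hand out the instances lost to rounding, largest fractional part first.
    remainder = total_synthetic - sum(counts)
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:remainder]:
        counts[i] += 1
    return counts


# Hypothetical example: a small, complex cluster and a large, simple one.
counts = allocate_synthetic_counts(
    cluster_sizes=[10, 40],
    cluster_complexities=[0.8, 0.2],
    total_synthetic=100,
)
```

In this toy setting the first cluster (10 instances, complexity 0.8) receives most of the 100 synthetic instances, which matches the qualitative behaviour the abstract describes. How the synthetic instances themselves are generated, and how the ordering of classes enters the weights, is left out because the abstract gives no detail on either.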