Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
494978 | Applied Soft Computing | 2015 | 16 Pages |

• Formulation of the classification rule mining problem as a multi-objective problem.
• Proposal of MOCA-I, which deals with uncertainty, class imbalance and volumetry together.
• Comparison of MOCA-I variants built on different DMLS versions; DMLS 1·* shows better results.
• Comparison with 13 state-of-the-art classification algorithms.
• MOCA-I gives shorter and statistically more effective rules than other algorithms.

Classification of medical data raises several problems, such as class imbalance, the double meaning of missing data, large data volumes (volumetry), and the need for highly interpretable results. This paper proposes a new algorithm, MOCA-I (Multi-Objective Classification Algorithm for Imbalanced data), a multi-objective local search algorithm designed to deal with all of these issues together. It is based on a new formulation of the task as a Pittsburgh multi-objective partial classification rule mining problem, which is described in the first part of the paper. An existing dominance-based multi-objective local search (DMLS) is modified to handle this formulation. After the parameters of MOCA-I are tuned experimentally and the most effective version of the DMLS algorithm is determined, the resulting MOCA-I version is compared to several state-of-the-art classification algorithms. The comparison is carried out on 10 small and medium-sized data sets from the literature and 2 real data sets; MOCA-I obtains the best results on the 10 literature data sets and is statistically better than the other approaches on the real data sets.
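
To make the idea of a dominance-based multi-objective local search more concrete, the following is a minimal Python sketch under stated assumptions. It is not the authors' MOCA-I implementation. The bit-string encoding, the `flip_neighbors` neighbourhood, and the `toy_objectives` function are hypothetical placeholders chosen for illustration, and the reading of "1·*" as "select one archive member, explore its whole neighbourhood" is an assumption not spelled out in the abstract. The sketch only shows the general pattern: keep an archive of mutually non-dominated solutions and grow it by exploring neighbours.

```python
import random
from typing import Callable, Iterable, List, Tuple

Solution = Tuple[int, ...]      # hypothetical encoding: a fixed-length bit string
Objectives = Tuple[float, ...]  # assumption: every objective is maximized


def dominates(a: Objectives, b: Objectives) -> bool:
    """Pareto dominance: a is no worse than b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def update_archive(archive: List[Solution], candidate: Solution,
                   evaluate: Callable[[Solution], Objectives]) -> bool:
    """Add candidate unless an archive member dominates or duplicates its objectives."""
    c = evaluate(candidate)
    for member in archive:
        m = evaluate(member)
        if m == c or dominates(m, c):
            return False
    # Remove members that the accepted candidate dominates.
    archive[:] = [s for s in archive if not dominates(c, evaluate(s))]
    archive.append(candidate)
    return True


def dmls(start: Solution,
         neighbors: Callable[[Solution], Iterable[Solution]],
         evaluate: Callable[[Solution], Objectives],
         max_iters: int = 1000, seed: int = 0) -> List[Solution]:
    """Dominance-based local search: pick one unvisited archive member,
    explore all of its neighbours, and stop when every member is visited."""
    rng = random.Random(seed)
    archive: List[Solution] = [start]
    visited = set()
    for _ in range(max_iters):
        unvisited = [s for s in archive if s not in visited]
        if not unvisited:
            break
        current = rng.choice(unvisited)   # select one solution
        visited.add(current)
        for n in neighbors(current):      # explore the whole neighbourhood
            update_archive(archive, n, evaluate)
    return archive


def flip_neighbors(solution: Solution) -> Iterable[Solution]:
    """Hypothetical neighbourhood: all solutions that differ by one flipped bit."""
    for i in range(len(solution)):
        yield solution[:i] + (1 - solution[i],) + solution[i + 1:]


def toy_objectives(solution: Solution) -> Objectives:
    """Placeholder objectives in conflict: count of ones and count of zeros."""
    ones = sum(solution)
    return (ones, len(solution) - ones)


if __name__ == "__main__":
    front = dmls((0, 0, 0, 0), flip_neighbors, toy_objectives)
    print(sorted(toy_objectives(s) for s in front))
```

On this toy problem the archive ends with five mutually non-dominated solutions, one for each possible count of ones. In a rule mining setting the solution would instead be a rule set and the objectives would be classification quality measures, which this sketch does not attempt to model.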