Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
382422 | Expert Systems with Applications | 2015 | 16 Pages |
• We propose a new entropy-based multithreshold linear classifier with adaptive kernel density estimation.
• The proposed classifier maximizes multiple margins while being conceptually similar to SVM.
• The method gives good classification results and is designed especially for unbalanced datasets.
• It achieves significantly better results than SVM as part of an expert system designed for drug discovery.
• The resulting model provides insight into the internal data geometry and can detect multiple clusters.
This paper proposes a new multithreshold linear classifier (MELC) based on Rényi's quadratic entropy and the Cauchy–Schwarz divergence, combined with adaptive kernel density estimation in the space of one-dimensional projections. By its nature, MELC is especially well suited to unbalanced data. As a consequence of both the model used and the density regularization technique applied, it shows strong regularization properties and is therefore almost unable to overfit. Moreover, unlike SVM, in its basic form it has no free parameters. This comes at the cost of a non-convex optimization problem, which results in local optima and a possible need for multiple initializations. In practice, MELC obtained scores similar to or higher than those of SVM on both synthetic data and real data from the UCI repository. We also evaluate the proposed method experimentally as part of an expert system designed for a drug discovery problem. There, MELC not only achieves better results than SVM but also gives additional insight into the data structure, resulting in a more complex decision support system.
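To make the quantities named in the abstract concrete, the following is a minimal sketch of a Cauchy–Schwarz divergence between Gaussian kernel density estimates of two classes projected onto a direction `v`. It is not the authors' implementation: the function names are hypothetical, and a fixed rule-of-thumb bandwidth stands in for the adaptive density estimation the paper describes. The term `-log(ip_aa)` is Rényi's quadratic entropy of the first projected class.

```python
import numpy as np


def rule_of_thumb_bandwidth(z):
    """Normal-reference bandwidth for a 1D Gaussian kernel.

    Assumption: a fixed bandwidth per class, not the paper's adaptive scheme.
    """
    return 1.06 * np.std(z) * z.size ** (-1.0 / 5.0)


def information_potential(a, b, var):
    """Integral of the product of two Gaussian KDEs built on samples a and b.

    For Gaussian kernels this has a closed form: the mean over all pairs of
    a zero-mean Gaussian with variance `var`, evaluated at a_i - b_j.
    """
    diff = a[:, None] - b[None, :]
    return np.mean(np.exp(-diff ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var))


def cs_divergence_1d(x_pos, x_neg, v):
    """Cauchy-Schwarz divergence of two classes projected onto direction v."""
    v = v / np.linalg.norm(v)
    a = x_pos @ v  # 1D projections of the first class
    b = x_neg @ v  # 1D projections of the second class
    sa = rule_of_thumb_bandwidth(a)
    sb = rule_of_thumb_bandwidth(b)
    ip_aa = information_potential(a, a, 2.0 * sa ** 2)      # integral of f^2
    ip_bb = information_potential(b, b, 2.0 * sb ** 2)      # integral of g^2
    ip_ab = information_potential(a, b, sa ** 2 + sb ** 2)  # integral of f*g
    # D_CS = log(int f^2) + log(int g^2) - 2 log(int f*g), non-negative
    return np.log(ip_aa) + np.log(ip_bb) - 2.0 * np.log(ip_ab)
```

Because each class contributes through its own normalized density, the value does not depend directly on how many samples each class has, which is one way to read the abstract's claim about unbalanced data. A classifier in this spirit would search for the direction `v` that maximizes the divergence, and since that objective is non-convex, the search would typically be restarted from several initial directions.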