Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
496073 | Applied Soft Computing | 2013 | 19 Pages |
In this paper, a new approach based on Differential Evolution (DE) for the automatic classification of items in medical databases is proposed. Based on it, a tool called DEREx is presented, which automatically extracts explicit knowledge from the database in the form of IF-THEN rules containing AND-connected clauses on the database variables. Each DE individual encodes a set of rules. An individual can contain more than one rule for each class, and these rules can be seen as logically connected by OR. Furthermore, the classifying rules for all the classes are found at once, in a single step. DEREx is intended as a useful support to decision making whenever an explanation of why an item is assigned to a given class should be provided, as is the case for diagnosis in the medical domain. The major contribution of this paper is that DEREx is the first classification tool in the literature that is based on DE and automatically extracts sets of IF-THEN rules without the intervention of any other mechanism. In fact, all other DE-based classification tools in the literature either simply find centroids for the classes rather than extracting rules, or are hybrid systems in which DE merely optimizes some parameters while the classification capabilities are provided by other mechanisms. For the experiments, eight databases from the medical domain have been considered. First, the most effective of ten classical DE variants, in terms of highest classification accuracy in a ten-fold cross-validation, has been identified. Second, the tool has been compared over the same eight databases against a set of fifteen classifiers widely used in the literature. The results demonstrate the effectiveness of the proposed approach, since DEREx turns out to be the best-performing tool in terms of classification accuracy. Statistical analysis has also confirmed that DEREx is the best classifier.
When compared to the other rule-based classification tools used here, DEREx needs the lowest average number of rules to solve a problem, and the average number of clauses per rule is not very high. In conclusion, the tool presented here is preferable to the other classifiers because it shows good classification accuracy, automatically extracts knowledge, and provides it to users in an easily comprehensible form.
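The rule structure described in the abstract (IF-THEN rules whose clauses are joined by AND, with several rules per class acting as an OR) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not DEREx's actual encoding: the interval form of a clause, the names `Clause`, `Rule`, `classify`, and `accuracy`, the first-match policy when rules of different classes fire together, and the toy labels and thresholds.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Clause:
    """One test on a single database variable (assumed interval form)."""
    var: int      # index of the variable in the item
    low: float    # lower bound, inclusive
    high: float   # upper bound, inclusive

    def holds(self, item: Sequence[float]) -> bool:
        return self.low <= item[self.var] <= self.high


@dataclass
class Rule:
    """IF clause_1 AND ... AND clause_k THEN label."""
    clauses: List[Clause]
    label: str

    def fires(self, item: Sequence[float]) -> bool:
        # AND connection: every clause must hold.
        return all(clause.holds(item) for clause in self.clauses)


def classify(rules: List[Rule], item: Sequence[float]) -> Optional[str]:
    """Return the label of the first rule that fires, or None.

    Several rules may share a label, so a class is assigned as soon as
    any one of its rules fires (OR connection). First-match conflict
    resolution is an assumption of this sketch.
    """
    for rule in rules:
        if rule.fires(item):
            return rule.label
    return None


def accuracy(rules: List[Rule], items, labels) -> float:
    """Fraction of items whose predicted label equals the true label."""
    hits = sum(classify(rules, x) == y for x, y in zip(items, labels))
    return hits / len(items)


# Toy rule set: one rule for "benign", two OR-connected rules for "malignant".
example_rules = [
    Rule([Clause(0, 0.0, 5.0), Clause(1, 2.0, 9.0)], "benign"),
    Rule([Clause(0, 6.0, 10.0)], "malignant"),
    Rule([Clause(1, 0.0, 1.5)], "malignant"),
]
```

In a scheme of this kind, a rule set such as `example_rules` would be what a single DE individual decodes to, and `accuracy` on the training fold would be a natural fitness value.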
Highlights
► An algorithm based solely on Differential Evolution is used as a Decision Support System in the medical domain.
► A set of eight medical databases is taken into account.
► Its performance is compared against a set of 15 well-known classifiers in terms of correct classification rate.
► Explicit IF-THEN rules are obtained and shown for each database.
► The results show the effectiveness of the proposed DE-based algorithm with respect to the other classifiers.
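For readers unfamiliar with Differential Evolution, the following is a minimal sketch of one classical variant, DE/rand/1/bin, written to maximize a fitness function such as classification accuracy. The abstract does not say which of the ten tested variants performed best, nor how a rule set is mapped to a real-valued vector, so the function name `de_rand_1_bin`, its parameters and default values, and the toy fitness are all assumptions for illustration.

```python
import random
from typing import Callable, List, Tuple

Vector = List[float]


def de_rand_1_bin(
    fitness: Callable[[Vector], float],
    dim: int,
    bounds: Tuple[float, float],
    pop_size: int = 20,
    f: float = 0.5,          # differential weight
    cr: float = 0.9,         # crossover rate
    generations: int = 100,
    seed: int = 0,
) -> Tuple[Vector, float]:
    """Maximize `fitness` with the textbook DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    low, high = bounds
    pop = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    # Mutation: base vector plus scaled difference vector.
                    value = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    value = min(max(value, low), high)  # clamp to bounds
                else:
                    value = pop[i][j]  # binomial crossover keeps the parent gene
                trial.append(value)
            trial_score = fitness(trial)
            # Greedy selection: the trial replaces the parent if not worse.
            if trial_score >= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_fitness(x: Vector) -> float:
    """Stand-in fitness with its maximum (0.0) at x = (1, 1, 1)."""
    return -sum((xi - 1.0) ** 2 for xi in x)


best_vector, best_score = de_rand_1_bin(toy_fitness, dim=3, bounds=(-5.0, 5.0))
```

In a rule-extraction setting, `toy_fitness` would be replaced by a function that decodes the vector into a rule set and returns its classification accuracy on the training data.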