Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
483482 | Informatics in Medicine Unlocked | 2016 | 13 Pages |
• Proposes a novel rule extraction algorithm, Re-RX with J48graft combined with sampling selection.
• Achieves highly accurate, concise, and interpretable classification rules for the Pima Indian Diabetes dataset.
• Obtained a peak accuracy of 86.09% after 10 runs of 10-fold cross validation on the Pima Indian Diabetes dataset.
• The present algorithm produced considerably fewer rules than the original Re-RX algorithm.
• Sampling Re-RX with J48graft is more suitable for medical decision making for T2DM.
Diabetes is a complex disease that is increasing in prevalence around the world. Type 2 diabetes mellitus (T2DM) accounts for about 90–95% of all diagnosed adult cases of diabetes. Most current diagnostic methods for T2DM are black-box models, which cannot give physicians the reasons underlying a diagnosis; algorithms that provide further insight are therefore needed. Rule extraction can provide such explanations; however, in the medical setting, extracted rules must be not only highly accurate but also simple and easy to understand. The Recursive-Rule eXtraction (Re-RX) algorithm is a “white-box” model that provides highly accurate classification. However, because of its recursive nature, it tends to generate more rules than other algorithms. In this study, we therefore propose a rule extraction algorithm, Re-RX with J48graft, combined with sampling selection techniques (sampling Re-RX with J48graft) to achieve highly accurate, concise, and interpretable classification rules for the Pima Indian Diabetes (PID) dataset, which comprises 768 samples with two classes (diabetes or non-diabetes) and eight continuous attributes. This algorithm achieved an average accuracy of 83.83% after 10 runs of 10-fold cross validation. Sampling Re-RX with J48graft achieved substantially better accuracy than the original Re-RX algorithm and produced considerably fewer rules and antecedents on average. These results suggest that sampling Re-RX with J48graft provides more accurate, concise, and interpretable extracted rules than previous algorithms, and is therefore more suitable for medical decision making, including the diagnosis of T2DM.
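
To make the evaluation protocol concrete, the following is a minimal Python sketch of how an average accuracy over 10 runs of 10-fold cross validation could be computed. It is an illustration under stated assumptions, not the authors' implementation: the folds are plain shuffled (unstratified) splits, the classifier is a placeholder majority-class predictor rather than Re-RX with J48graft, and the data are synthetic values that only match the PID dataset in shape (768 samples, eight attributes, two classes). The names `k_fold_indices`, `repeated_cv_accuracy`, `fit_majority`, and `predict_majority` are hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, rng):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def repeated_cv_accuracy(X, y, fit, predict, runs=10, k=10, seed=0):
    """Return (mean test accuracy, number of folds) over `runs` x `k` folds."""
    rng = random.Random(seed)
    fold_accuracies = []
    for _ in range(runs):
        # Each run reshuffles the data, so the k folds differ between runs.
        for test_idx in k_fold_indices(len(y), k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
            correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
            fold_accuracies.append(correct / len(test_idx))
    return mean(fold_accuracies), len(fold_accuracies)


# Placeholder classifier (NOT Re-RX): always predicts the majority training class.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def predict_majority(model, x):
    return model


# Synthetic stand-in data (assumption): same shape as the PID dataset only.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(8)] for _ in range(768)]
y = [data_rng.randint(0, 1) for _ in range(768)]

avg_acc, n_folds = repeated_cv_accuracy(X, y, fit_majority, predict_majority)
print(f"{n_folds} folds evaluated, mean accuracy = {avg_acc:.4f}")
```

With 10 runs of 10 folds, the reported average is a mean over 100 held-out test folds. Any classifier exposing a `fit`/`predict` pair of this shape could be substituted for the placeholder.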