Article code: 483482
Journal code: 701411
Publication year: 2016
English article: 13 pages, PDF
Full-text version: free download
English title of the ISI article
Rule extraction using Recursive-Rule extraction algorithm with J48graft combined with sampling selection techniques for the diagnosis of type 2 diabetes mellitus in the Pima Indian dataset
Keywords
Rule extraction; Type 2 diabetes; Re-RX algorithm; Sampling selection; Pima Indian diabetes; Data mining
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Science (General)
Abstract


• Proposes a novel rule extraction algorithm, Re-RX with J48graft combined with sampling selection.
• Achieves highly accurate, concise, and interpretable classification rules for the Pima Indian dataset.
• Obtains a peak accuracy of 86.09% after 10 runs of 10-fold cross validation on the Pima Indian dataset.
• The present algorithm yields considerably fewer rules than the original Re-RX algorithm.
• Sampling Re-RX with J48graft is more suitable for medical decision making for T2DM.

Diabetes is a complex disease that is increasing in prevalence around the world. Type 2 diabetes mellitus (T2DM) accounts for about 90–95% of all diagnosed adult cases of diabetes. Most current diagnostic methods for T2DM are black-box models, which cannot give physicians the reasons underlying a diagnosis; therefore, algorithms that can provide further insight are needed. Rule extraction can provide such explanations; however, in the medical setting, extracted rules must be not only highly accurate but also simple and easy to understand. The Recursive-Rule eXtraction (Re-RX) algorithm is a "white-box" model that provides highly accurate classification. However, due to its recursive nature, it tends to generate more rules than other algorithms. Therefore, in this study, we propose the use of a rule extraction algorithm, Re-RX with J48graft, combined with sampling selection techniques (sampling Re-RX with J48graft) to achieve highly accurate, concise, and interpretable classification rules for the Pima Indian Diabetes (PID) dataset, which comprises 768 samples with two classes (diabetes or non-diabetes) and eight continuous attributes. The use of this algorithm resulted in an average accuracy of 83.83% after 10 runs of 10-fold cross validation. Sampling Re-RX with J48graft achieved substantially better accuracy than the original Re-RX algorithm and produced considerably fewer rules and antecedents on average. These results suggest that sampling Re-RX with J48graft provides more accurate, concise, and interpretable extracted rules than previous algorithms, and is therefore more suitable for medical decision making, including the diagnosis of T2DM.
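
The evaluation protocol named in the abstract, 10 runs of 10-fold cross validation, can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not reproduce Re-RX or J48graft: the classifier is a placeholder that always predicts the majority class, the labels are synthetic (768 samples with an assumed class split chosen for illustration), and the function names `k_fold_indices`, `majority_class`, and `repeated_cv_accuracy` are hypothetical. It only shows how the 100 per-fold accuracies behind an averaged figure would be collected.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, rng):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def majority_class(labels):
    """Placeholder 'model' (assumption): predict the most frequent training label."""
    return max(set(labels), key=labels.count)


def repeated_cv_accuracy(labels, runs=10, k=10, seed=0):
    """Return one accuracy per (run, fold) pair, i.e. runs * k values in total."""
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        # Each run reshuffles the data, so the fold membership differs per run.
        folds = k_fold_indices(len(labels), k, rng)
        for test_fold in folds:
            held_out = set(test_fold)
            train_labels = [y for i, y in enumerate(labels) if i not in held_out]
            prediction = majority_class(train_labels)
            correct = sum(1 for i in test_fold if labels[i] == prediction)
            scores.append(correct / len(test_fold))
    return scores


# Synthetic stand-in labels: 768 samples, two classes. NOT the real PID data.
labels = [0] * 500 + [1] * 268
scores = repeated_cv_accuracy(labels)
print(len(scores), round(mean(scores), 4))
```

Replacing `majority_class` with a real rule-based classifier trained on the eight attributes would turn this skeleton into an actual evaluation; the averaging over all run and fold pairs stays the same.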

Publisher
Database: Elsevier - ScienceDirect
Journal: Informatics in Medicine Unlocked - Volume 2, 2016, Pages 92–104