Article code: 382779
Journal code: 660790
Publication year: 2015
English article: 8-page PDF
Full-text version: free download
English title of the ISI article
Finding the best diversity generation procedures for mining contrast patterns
Persian translation of the title
پیدا کردن بهترین روش های تولید تنوع برای الگوهای کنتراست معدن
Keywords
Understandable classifiers, contrast patterns, ensemble diversity, deterministic procedures
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract


• Comparison of diversity generation procedures for mining contrast patterns.
• Diversity calculated based on the number of total, unique, and minimal patterns.
• Three new deterministic methods for generating diversity in decision trees.
• Study of the influence of data type on the diversity and accuracy of the methods.
• Random Forest and Bagging are the best procedures.

Most understandable classifiers are based on contrast patterns, which can be accurately mined from decision trees. Nevertheless, tree diversity must be ensured to mine a representative pattern collection. In this paper, we perform an experimental comparison of different diversity generation procedures. We compare the diversity generated by each procedure based on the number of total, unique, and minimal patterns extracted from the induced trees for different minimal support thresholds. This comparison, together with an accuracy and abstention experiment, shows that Random Forest and Bagging generate the most diverse and accurate pattern collections. Additionally, we study the influence of data type on the results, finding that Random Forest is best for categorical data and Bagging for numerical data. The comparison includes most known diversity generation procedures and three new deterministic procedures introduced here. These deterministic procedures outperform the existing deterministic method but are still outperformed by random procedures.
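
The three diversity counts named in the abstract can be illustrated with a minimal Python sketch. The abstract does not define them, so the definitions here are assumptions: a pattern is modelled as a set of conditions, "total" counts every extracted pattern that meets the support threshold, "unique" counts distinct patterns, and "minimal" counts unique patterns that have no proper sub-pattern in the collection. The function name `diversity_counts` and the sample data are hypothetical and do not come from the paper.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Assumption: a pattern is a conjunction of conditions, stored as a frozenset.
Pattern = FrozenSet[str]


def diversity_counts(
    extracted: Iterable[Tuple[Pattern, float]], min_support: float
) -> Tuple[int, int, int]:
    """Return (total, unique, minimal) counts for patterns meeting min_support."""
    # Keep only patterns whose support reaches the threshold.
    kept: List[Pattern] = [p for p, support in extracted if support >= min_support]
    total = len(kept)
    unique = set(kept)
    # Assumed definition: a pattern is minimal if no other kept pattern
    # is a proper subset of it (i.e. no more general pattern exists).
    minimal = {p for p in unique if not any(q < p for q in unique)}
    return total, len(unique), len(minimal)


# Hypothetical (pattern, support) pairs, as if gathered from several trees.
patterns = [
    (frozenset({"a=1"}), 0.40),
    (frozenset({"a=1", "b=2"}), 0.30),
    (frozenset({"a=1"}), 0.40),  # same pattern found again in another tree
    (frozenset({"c=3", "d=4"}), 0.10),
    (frozenset({"e=5"}), 0.02),
]

for threshold in (0.05, 0.2):
    print(threshold, diversity_counts(patterns, threshold))
```

Under these assumed definitions, raising the support threshold shrinks all three counts, which is why the abstract reports the comparison at several thresholds and not at a single one.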

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 42, Issue 11, 1 July 2015, Pages 4859–4866