Article code: 495125 | Journal code: 862816 | Publication year: 2015 | English article: 11 pages | Full-text version: PDF
English title (ISI article)
The use of vicinal-risk minimization for training decision trees
Keywords
Decision trees; risk minimization; classification
Related subjects
Engineering and Basic Sciences › Computer Engineering › Computer Science Software
English abstract


• We propose the use of Vapnik's vicinal risk minimization (VRM) for training decision trees to approximately maximize decision margins.
• We implement VRM by propagating uncertainties in the input attributes into the labeling decisions. We perform a global regularization over the decision tree structure.
• During a training phase, a decision tree is constructed to minimize the total probability of misclassifying the labeled training examples, a process which approximately maximizes the margins of the resulting classifier.
• We perform the necessary minimization using genetic programming and present results over a range of synthetic and benchmark real datasets.
• We demonstrate the statistical superiority of VRM training over conventional empirical risk minimization (ERM) and the well-known C4.5 algorithm, for a range of synthetic and real datasets. We also conclude that there is no statistical difference between trees trained by ERM and using C4.5. Training with VRM is shown to be more stable and repeatable than by ERM.

We propose the use of Vapnik's vicinal risk minimization (VRM) for training decision trees to approximately maximize decision margins. We implement VRM by propagating uncertainties in the input attributes into the labeling decisions. In this way, we perform a global regularization over the decision tree structure. During a training phase, a decision tree is constructed to minimize the total probability of misclassifying the labeled training examples, a process which approximately maximizes the margins of the resulting classifier. We perform the necessary minimization using an appropriate meta-heuristic (genetic programming) and present results over a range of synthetic and benchmark real datasets. We demonstrate the statistical superiority of VRM training over conventional empirical risk minimization (ERM) and the well-known C4.5 algorithm, for a range of synthetic and real datasets. We also conclude that there is no statistical difference between trees trained by ERM and using C4.5. Training with VRM is shown to be more stable and repeatable than by ERM.
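As a rough illustration of how uncertainty in the input attributes can be propagated into the labeling decisions, the sketch below scores a fixed axis-aligned decision tree by the total probability of misclassifying the labeled training examples. The Gaussian vicinity function, the per-attribute sigma values, and the names Node and vicinal_risk are illustrative assumptions rather than the paper's exact formulation; in the training scheme described above, a quantity of this kind would play the role of the fitness that genetic programming minimizes.

```python
import math
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    """Axis-aligned decision tree node; a leaf carries only a class label."""
    label: Optional[int] = None          # set for leaves
    attr: int = 0                        # attribute index tested at internal nodes
    threshold: float = 0.0               # split threshold
    left: Optional["Node"] = None        # branch taken when x[attr] <= threshold
    right: Optional["Node"] = None       # branch taken when x[attr] >  threshold

def prob_left(x_val: float, threshold: float, sigma: float) -> float:
    """P(attribute <= threshold) when the observed value has Gaussian uncertainty sigma."""
    if sigma <= 0.0:                     # no uncertainty: hard (ERM-style) split
        return 1.0 if x_val <= threshold else 0.0
    z = (threshold - x_val) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

def leaf_distribution(node: Node, x, sigma, mass: float = 1.0):
    """Yield (leaf, probability mass) pairs by propagating input uncertainty down the tree."""
    if node.label is not None:
        yield node, mass
        return
    p = prob_left(x[node.attr], node.threshold, sigma[node.attr])
    if p > 0.0:
        yield from leaf_distribution(node.left, x, sigma, mass * p)
    if p < 1.0:
        yield from leaf_distribution(node.right, x, sigma, mass * (1.0 - p))

def vicinal_risk(tree: Node, X, y, sigma) -> float:
    """Total probability of misclassifying the labeled examples: the quantity a
    meta-heuristic such as genetic programming would minimize during training."""
    risk = 0.0
    for x, label in zip(X, y):
        risk += sum(m for leaf, m in leaf_distribution(tree, x, sigma) if leaf.label != label)
    return risk

# Example: a single split on attribute 0 at 0.5, with per-attribute uncertainty 0.1.
tree = Node(attr=0, threshold=0.5, left=Node(label=0), right=Node(label=1))
X = [[0.45, 0.0], [0.60, 0.0]]
y = [0, 1]
print(vicinal_risk(tree, X, y, sigma=[0.1, 0.1]))
```

With sigma set to zero the same routine reduces to counting hard misclassifications, i.e. ordinary empirical risk minimization, which is the baseline the abstract compares against.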


Publisher
Database: Elsevier - ScienceDirect
Journal: Applied Soft Computing - Volume 31, June 2015, Pages 185–195
Authors
, ,