Article code: 386249
Journal code: 660881
Publication year: 2014
English article: 13 pages, PDF
Full-text version: free download
English title of the ISI article
Credal-C4.5: Decision tree based on imprecise probabilities to classify noisy data
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract


• A new algorithm inspired by C4.5 and imprecise probabilities is defined.
• This algorithm (Credal-C4.5) assumes that the data set may be unreliable when the tree is built.
• A pruning process is also incorporated into Credal-C4.5.
• Several experiments are carried out to compare the algorithms.
• The new method is especially suitable for classifying noisy data sets.

In classification, C4.5 is a well-known algorithm widely used to build decision trees. It includes a pruning process to address over-fitting. This paper presents a modification of C4.5 called Credal-C4.5. The new procedure relies on a mathematical theory of imprecise probabilities and on uncertainty measures. Credal-C4.5 estimates the probabilities of the features and of the class variable using imprecise probabilities. It also uses a new split criterion, the Imprecise Information Gain Ratio, which applies uncertainty measures to convex sets of probability distributions (credal sets). In this way, Credal-C4.5 builds trees for classification problems under the assumption that the training set is not fully reliable. We carried out several experimental studies comparing the new procedure with other ones, and the principal conclusion is that, in domains with class noise, Credal-C4.5 produces smaller trees and better performance than classic C4.5.
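
The abstract does not spell out how the split criterion is computed, so the following is only a rough sketch of what an "Imprecise Information Gain Ratio" could look like. It rests on assumptions that are not stated above: the credal sets are taken from Walley's Imprecise Dirichlet Model with parameter s = 1, the uncertainty measure is the maximum entropy over the credal set, the feature weights come from the maximum-entropy distribution of the feature's credal set, and the denominator is the classic C4.5 split information. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def max_entropy_distribution(counts: Sequence[float], s: float = 1.0) -> list[float]:
    """Maximum-entropy distribution in the credal set
    {p : p_i >= n_i / (N + s), sum(p) = 1} (assumed IDM credal set).

    The extra mass s is poured onto the smallest counts first
    (water-filling), which makes the result as uniform as possible.
    """
    ordered = sorted(counts)
    total = sum(counts)
    level, remaining, filled = ordered[0], s, 1
    while filled < len(ordered):
        # Mass needed to raise the `filled` lowest cells to the next count.
        need = (ordered[filled] - level) * filled
        if need >= remaining:
            break
        remaining -= need
        level = ordered[filled]
        filled += 1
    level += remaining / filled
    return [max(c, level) / (total + s) for c in counts]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def upper_entropy(counts: Sequence[float], s: float = 1.0) -> float:
    """Maximum entropy over the assumed credal set for these counts."""
    return entropy(max_entropy_distribution(counts, s))


def imprecise_info_gain_ratio(
    feature: Sequence[Hashable], labels: Sequence[Hashable], s: float = 1.0
) -> float:
    """Sketch of an imprecise gain ratio for one discrete feature."""
    classes = list(dict.fromkeys(labels))
    feature_counts = Counter(feature)
    values = list(feature_counts)

    # Assumption: feature weights use the max-entropy distribution as well.
    weights = max_entropy_distribution([feature_counts[v] for v in values], s)

    conditional = 0.0
    for value, weight in zip(values, weights):
        subset = Counter(l for f, l in zip(feature, labels) if f == value)
        conditional += weight * upper_entropy([subset[c] for c in classes], s)

    class_counts = Counter(labels)
    gain = upper_entropy([class_counts[c] for c in classes], s) - conditional

    # Assumption: classic C4.5 split information on empirical frequencies.
    n = len(labels)
    split_info = entropy([feature_counts[v] / n for v in values])
    return gain / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    y = [0, 0, 0, 1, 1, 1]
    print(imprecise_info_gain_ratio(["a", "a", "a", "b", "b", "b"], y))
    print(imprecise_info_gain_ratio(["a", "b", "a", "b", "a", "b"], y))
```

In this sketch a feature that separates the classes scores higher than one that does not, and small subsets are pulled toward the uniform distribution, which is one way a criterion can become more cautious when labels may be noisy.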

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 41, Issue 10, August 2014, Pages 4625–4637
Authors
, ,