Article ID: 4946067
Journal: Knowledge-Based Systems
Published Year: 2017
Pages: 30 Pages
File Type: PDF
Abstract
The Random Forest classifier is considered an important reference in the data mining area. Its base classifier (a decision tree) is built principally from a randomization process over data and features, and from a split criterion that uses classic precise probabilities to quantify information gain. One drawback of this classifier is that it performs poorly on data sets with class noise. It was recently shown that a new criterion, which uses imprecise probabilities and general uncertainty measures, can improve on the performance of the classic split criteria. In this work, the base classifier of the Random Forest is modified to use that new criterion, which also yields a new single decision tree model. This model, combined with the randomization process over features, is the base classifier of a new procedure similar to the Random Forest, called Credal Random Forest. The principal differences between the two models are presented. An experimental study shows that the new method improves on the Random Forest when both are applied to data sets without class noise, and that this improvement is notably greater when they are applied to data sets with class noise.
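The abstract does not spell out the new split criterion, so the following is only a minimal sketch of what a criterion based on imprecise probabilities might look like. It assumes the Imprecise Dirichlet Model with a hyperparameter `s` and uses the maximum entropy over the resulting set of distributions as the uncertainty measure, a combination commonly used in the credal decision tree literature. The function names, the default `s = 1`, and the use of empirical frequencies to weight the branches are all assumptions made for illustration, not details taken from the article.

```python
import math
from collections import Counter


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def max_entropy_idm(counts, s=1.0):
    """Maximum entropy over the credal set induced by class counts.

    Assumption: under the Imprecise Dirichlet Model each class probability
    lies in [n_c / (N + s), (n_c + s) / (N + s)]. The most uniform member
    is found by spreading the extra mass s over the smallest counts.
    """
    levels = sorted(float(c) for c in counts)
    if not levels:
        return 0.0
    total = sum(levels) + s
    remaining = s
    k = 1  # number of lowest classes currently sharing the same level
    while k < len(levels):
        need = (levels[k] - levels[0]) * k  # mass to lift them to the next level
        if need > remaining:
            break
        remaining -= need
        for i in range(k):
            levels[i] = levels[k]
        k += 1
    for i in range(k):
        levels[i] += remaining / k  # share the leftover mass evenly
    return entropy(level / total for level in levels)


def imprecise_info_gain(labels, feature_values, s=1.0):
    """Imprecise analogue of information gain for one candidate feature.

    Assumption: branches are weighted by their empirical frequencies.
    """
    n = len(labels)
    classes = sorted(set(labels))

    def h_star(subset):
        c = Counter(subset)
        return max_entropy_idm([c[k] for k in classes], s)

    gain = h_star(labels)
    for v in set(feature_values):
        subset = [y for y, x in zip(labels, feature_values) if x == v]
        gain -= len(subset) / n * h_star(subset)
    return gain


if __name__ == "__main__":
    labels = ["a", "a", "a", "b"]
    print(imprecise_info_gain(labels, [0, 0, 1, 1]))  # moderately informative split
    print(imprecise_info_gain(labels, [0, 1, 2, 3]))  # one sample per branch
```

Under these assumptions the score can be negative, which classic information gain cannot be. A feature that isolates every sample in its own branch is penalized rather than rewarded, because each branch has too little data to narrow the probability intervals. This is one plausible reason such a criterion would be less sensitive to mislabeled examples.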
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence