Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6854967 | Expert Systems with Applications | 2018 | 45 Pages |
Abstract
Decision tree learning algorithms are known to be unstable, because small changes in the training data can result in highly different decision trees. An important issue is how to quantify decision tree stability. Two types of stability are defined in the literature: structural and semantic stability. However, existing structural stability measures are meaningless when applied to apparently different decision trees, and semantic stability only focuses on prediction accuracy without considering structural information. This paper proposes a region-compatibility-based structural stability measure for decision trees that considers the structural distribution of leaves from the viewpoint of basic probability assignments in evidence theory. To the best of our knowledge, we are the first to use basic probability assignments to quantify decision tree stability. We prove convergence for region compatibility, and show that apparently different decision trees have some inherent similarity from the viewpoint of region compatibility. We also clarify the meaning of region compatibility for measuring decision tree stability, and derive a method to select a relatively stable learning algorithm for a given dataset. Experimental results validate that region compatibility is effective for quantifying the stability of decision tree learning algorithms.
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Lihong Wang, Qiang Li, Yanwei Yu, Jinglei Liu