Article ID: 1148556
Journal: Journal of Statistical Planning and Inference
Published Year: 2007
Pages: 13
File Type: PDF
Abstract
Using 1998 and 1999 singleton birth data from the State of Florida, we study the stability of classification trees. Tree stability depends on both the learning algorithm and the specific data set. In this study, test samples are used in statistical learning to evaluate both stability and predictive performance. We also use the bootstrap, a resampling technique that can be regarded as data self-perturbation, to evaluate the sensitivity of the modeling algorithm to the specific data set. We demonstrate that the choice of cost function plays an important role in stability. In particular, classifiers with equal misclassification costs and equal priors are less stable than those with unequal misclassification costs and equal priors.
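The bootstrap-based stability check described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' procedure: the data are synthetic rather than the Florida birth records, the "tree" is a single-split stump on one feature, priors are ignored, and the names `fit_stump`, `predict`, `bootstrap_stability`, `make_data`, `cost_fn`, and `cost_fp` are hypothetical. Stability is measured here as the average agreement of bootstrap-trained classifiers with their own majority vote on a fixed test sample, which is only one of several possible definitions.

```python
import random

def fit_stump(xs, ys, cost_fn=1.0, cost_fp=1.0):
    """Fit a one-split tree on one feature by minimising total misclassification cost.

    cost_fn: cost of predicting 0 for a true 1 (hypothetical parameter).
    cost_fp: cost of predicting 1 for a true 0 (hypothetical parameter).
    Returns (threshold, left_label, right_label).
    """
    def leaf(labels):
        # Choose the leaf label with the lower total cost.
        pos = sum(labels)
        neg = len(labels) - pos
        cost_if_0 = cost_fn * pos  # every positive would be misclassified
        cost_if_1 = cost_fp * neg  # every negative would be misclassified
        return (0, cost_if_0) if cost_if_0 <= cost_if_1 else (1, cost_if_1)

    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        (l_lab, l_cost), (r_lab, r_cost) = leaf(left), leaf(right)
        total = l_cost + r_cost
        if best is None or total < best[0]:
            best = (total, t, l_lab, r_lab)
    return best[1:]

def predict(stump, x):
    t, l_lab, r_lab = stump
    return l_lab if x <= t else r_lab

def bootstrap_stability(train, test_xs, n_boot=30, seed=0, **costs):
    """Average agreement with the majority vote across bootstrap-trained stumps."""
    rng = random.Random(seed)
    votes = [0] * len(test_xs)
    for _ in range(n_boot):
        # Resample the training set with replacement (data self-perturbation).
        sample = [rng.choice(train) for _ in train]
        xs, ys = zip(*sample)
        stump = fit_stump(xs, ys, **costs)
        for i, x in enumerate(test_xs):
            votes[i] += predict(stump, x)
    agree = [max(v, n_boot - v) / n_boot for v in votes]
    return sum(agree) / len(agree)

def make_data(n, rng):
    # Synthetic stand-in data: a rare, noisy binary outcome driven by one feature.
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if rng.random() < 0.1 + 0.3 * x else 0
        data.append((x, y))
    return data

rng = random.Random(42)
train = make_data(200, rng)
test_xs = [x for x, _ in make_data(100, rng)]

equal = bootstrap_stability(train, test_xs, cost_fn=1.0, cost_fp=1.0)
unequal = bootstrap_stability(train, test_xs, cost_fn=5.0, cost_fp=1.0)
print(f"equal costs: {equal:.3f}, unequal costs: {unequal:.3f}")
```

The score lies between 0.5 (bootstrap classifiers split evenly on every test point) and 1.0 (all classifiers agree everywhere). The numbers printed for this toy setup say nothing about the paper's findings; the sketch only shows how a cost function can enter both the fitting step and the stability comparison.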