Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
1148556 | Journal of Statistical Planning and Inference | 2007 | 13 Pages |
Abstract
Using 1998 and 1999 singleton birth data of the State of Florida, we study the stability of classification trees. Tree stability depends on both the learning algorithm and the specific data set. In this study, test samples are used in statistical learning to evaluate both stability and predictive performance. We also use the resampling technique bootstrap, which can be regarded as data self-perturbation, to evaluate the sensitivity of the modeling algorithm with respect to the specific data set. We demonstrate that the selection of the cost function plays an important role in stability. In particular, classifiers with equal misclassification costs and equal priors are less stable compared to those with unequal misclassification costs and equal priors.
Keywords
Related Topics
Physical Sciences and Engineering
Mathematics
Applied Mathematics
Authors
Panagiota Kitsantas, Myles Hollander, Lei M. Li,