Article ID: 531377
Journal: Pattern Recognition
Published Year: 2010
Pages: 10
File Type: PDF
Abstract

The performance of m-out-of-n bagging with and without replacement is analyzed in terms of the sampling ratio (m/n). Standard bagging uses resampling with replacement to generate bootstrap samples of the same size as the original training set, m_wor = n. Without-replacement methods typically use half samples, m_wr = n/2. These choices of sample size are arbitrary and need not be optimal in terms of the classification performance of the ensemble. We propose to use the out-of-bag estimates of the generalization accuracy to select a near-optimal value for the sampling ratio. Ensembles of classifiers trained on independent samples, whose size is chosen so that the out-of-bag error of the ensemble is as low as possible, generally improve on the performance of standard bagging and can be built efficiently.
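
The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the 1-nearest-neighbour base learner, the toy data, the candidate ratios, and all function names (draw_sample, predict_1nn, oob_error, select_sampling_ratio) are assumptions made here for illustration. For each candidate ratio m/n, an ensemble is built on m-out-of-n samples, each training point is classified only by the members that did not see it, and the ratio with the lowest resulting out-of-bag error is kept.

```python
import random
from collections import Counter


def draw_sample(n, m, replace, rng):
    """Draw m indices out of n, with or without replacement (needs m <= n without)."""
    if replace:
        return [rng.randrange(n) for _ in range(m)]
    return rng.sample(range(n), m)


def predict_1nn(X, y, sample, x):
    """Label of the sampled point closest to x (stand-in base learner, an assumption)."""
    nearest = min(sample, key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)))
    return y[nearest]


def oob_error(X, y, m, replace, n_estimators, rng):
    """Out-of-bag error of an ensemble whose members use m-out-of-n samples."""
    n = len(X)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_estimators):
        sample = draw_sample(n, m, replace, rng)
        in_bag = set(sample)
        for i in range(n):
            if i not in in_bag:  # only members that did not see point i vote on it
                votes[i][predict_1nn(X, y, sample, X[i])] += 1
    scored = [i for i in range(n) if votes[i]]
    if not scored:  # e.g. m = n without replacement leaves nothing out of bag
        return float("inf")
    wrong = sum(votes[i].most_common(1)[0][0] != y[i] for i in scored)
    return wrong / len(scored)


def select_sampling_ratio(X, y, ratios, replace, n_estimators=25, seed=0):
    """Return (best_ratio, errors); best_ratio has the lowest out-of-bag error."""
    rng = random.Random(seed)
    n = len(X)
    errors = {}
    for r in ratios:
        m = max(1, round(r * n))
        errors[r] = oob_error(X, y, m, replace, n_estimators, rng)
    best = min(errors, key=errors.get)
    return best, errors


# Toy data, made up for illustration: two overlapping 2-D classes.
data_rng = random.Random(1)
X = [(data_rng.gauss(c, 1.0), data_rng.gauss(c, 1.0)) for c in (0, 2) for _ in range(40)]
y = [c for c in (0, 2) for _ in range(40)]

best, errors = select_sampling_ratio(X, y, ratios=[0.1, 0.2, 0.5, 0.8], replace=False)
print(best, errors)
```

Note that without replacement the ratio must stay below 1 for any point to be left out of bag, which is why the sketch returns an infinite error in that case instead of a misleading zero.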

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition