| Article ID | Journal | Published Year | Pages | File Type |
|---|---|---|---|---|
| 6941381 | Pattern Recognition Letters | 2014 | 12 Pages | |
Abstract
The aim of this paper is to present an incremental selection strategy by which the classification accuracy of semi-supervised learning (SSL) algorithms can be improved. In SSL, both a limited number of labeled data and a multitude of unlabeled data are used to learn a classification model. However, it is also well known that utilizing the unlabeled data is not always helpful for SSL algorithms. To use them efficiently in learning the classification model, some of the unlabeled data that are deemed useful for the learning process are selected and assigned correctly estimated labels. To address this problem, particularly for the semi-supervised MarginBoost (SSMB) algorithm (d'Alché-Buc et al., 2002), this paper considers and empirically compares two selection strategies, named simply recycled selection and incrementally reinforced selection. Our experimental results, obtained on well-known benchmark data sets, including SSL-type benchmarks and several UCI data sets, demonstrate that the latter, i.e., incrementally selecting only a small portion of strong examples from the available unlabeled data, can compensate for the shortcomings of the existing SSMB algorithm. Moreover, compared to the former, it generally achieves better classification accuracy.
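The core idea described above, repeatedly picking only the most confidently predicted unlabeled examples, pseudo-labeling them, and folding them into the labeled pool, can be illustrated with a minimal sketch. This is not the authors' SSMB procedure: it substitutes a toy nearest-centroid classifier on 1-D features, and the margin-like confidence score, the round count, and the per-round budget `k` are all illustrative assumptions.

```python
# Hedged sketch of incremental high-confidence selection for SSL.
# NOT the SSMB algorithm of d'Alche-Buc et al. (2002): the base learner
# here is a toy nearest-centroid classifier, and the "margin" is just
# the gap between the two class distances, used as a confidence proxy.

def fit_centroids(xs, ys):
    """Fit a nearest-centroid classifier on 1-D features (classes 0/1)."""
    groups = {0: [], 1: []}
    for x, y in zip(xs, ys):
        groups[y].append(x)
    return {c: sum(v) / len(v) for c, v in groups.items()}

def predict_with_margin(centroids, x):
    """Return (predicted label, confidence proxy) for one example."""
    d0, d1 = abs(x - centroids[0]), abs(x - centroids[1])
    return (0 if d0 < d1 else 1), abs(d0 - d1)

def incremental_selection(lab_x, lab_y, unl_x, rounds=3, k=2):
    """Each round, pseudo-label the k strongest unlabeled examples and
    add them to the labeled pool; retrain before every selection step."""
    lab_x, lab_y, pool = list(lab_x), list(lab_y), list(unl_x)
    for _ in range(rounds):
        if not pool:
            break
        centroids = fit_centroids(lab_x, lab_y)
        scored = sorted(((predict_with_margin(centroids, x), x) for x in pool),
                        key=lambda t: -t[0][1])  # strongest margin first
        for (label, _), x in scored[:k]:
            lab_x.append(x)
            lab_y.append(label)
            pool.remove(x)
    return fit_centroids(lab_x, lab_y)
```

The contrast with "simply recycled selection" would be retraining once and consuming the whole pool in a single pass; the incremental variant above re-estimates confidence after each small batch, so early pseudo-labels can sharpen later selections.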
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Vision and Pattern Recognition
Authors
Thanh-Binh Le, Sang-Woon Kim