Article ID Journal Published Year Pages File Type
4947474 Neurocomputing 2017 23 Pages PDF
Abstract
Many real-world machine learning tasks have very limited labeled data but a large amount of unlabeled data. To take advantage of the unlabeled data for enhancing learning performance, several semi-supervised learning techniques have been developed. In this paper, we propose a novel semi-supervised ensemble learning algorithm, termed Multi-Train, which generates a number of heterogeneous classifiers that use different classification models and/or different features. During the training process, each classifier is refined using unlabeled data, which are labeled by the majority prediction of the rest classifiers. We hypothesize that the use of different models and different input features can promote the diversity of the ensemble, thereby improving the performance compared to existing methods such as the co-training and tri-training algorithms. Experimental results on the UCI datasets clearly demonstrated the effectiveness of using heterogeneous ensembles in semi-supervised learning.
Related Topics
Physical Sciences and Engineering Computer Science Artificial Intelligence
Authors
, ,