Semi-supervised classification method through oversampling and common hidden space

Article ID	Journal	Published Year	Pages	File Type
392291	Information Sciences	2016	13 Pages	PDF

Abstract

Semi-supervised classification methods attempt to improve classification performance when only a small amount of labeled data is available by making full use of abundant unlabeled data. Although existing semi-supervised classification methods have shown promising results in many applications, they still have drawbacks, including performance degradation caused by the introduction of unlabeled data and by partially false labels in the small labeled set. To circumvent these drawbacks, this paper proposes a new semi-supervised classification method, OCHS-SSC, based on oversampling and a common hidden space. The proposed method has two primary characteristics. First, unlabeled data are used only to generate new synthetic data that extend the small labeled set. Second, the final classifier is learned not in the original feature space alone but in an extended feature space, composed of the original feature space and the common hidden space found between the labeled data and the synthetic data. Extensive experiments on 23 datasets indicate the effectiveness of the proposed method.
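
The two characteristics described in the abstract can be illustrated with a minimal sketch. This is not the OCHS-SSC algorithm itself, whose details the abstract does not give. It assumes a SMOTE-like interpolation from each labeled point toward its nearest unlabeled neighbors as the oversampling step, and it uses a plain SVD projection of the stacked labeled and synthetic data as a stand-in for the common hidden space. All function names, the parameters `k` and `n_hidden`, the interpolation range, and the nearest-centroid classifier are hypothetical choices made for illustration only.

```python
import numpy as np

def oversample_with_unlabeled(X_lab, y_lab, X_unl, k=3, seed=0):
    """Assumed oversampling step: interpolate each labeled point toward its
    k nearest unlabeled points; each synthetic point inherits the label."""
    rng = np.random.default_rng(seed)
    # Squared distances from every labeled point to every unlabeled point.
    d2 = ((X_lab[:, None, :] - X_unl[None, :, :]) ** 2).sum(axis=2)
    nn = np.argsort(d2, axis=1)[:, :k]                 # (n_lab, k)
    # Interpolation weights in [0, 0.5) keep synthetic points nearer the labeled one.
    lam = rng.uniform(0.0, 0.5, size=(X_lab.shape[0], k, 1))
    X_syn = X_lab[:, None, :] + lam * (X_unl[nn] - X_lab[:, None, :])
    return X_syn.reshape(-1, X_lab.shape[1]), np.repeat(y_lab, k)

def fit_extended_space(X_lab, X_syn, n_hidden=2):
    """Stand-in for the common hidden space: top right-singular vectors of the
    centered, stacked labeled and synthetic data. Returns a function mapping
    X to [original features, hidden features]."""
    X_all = np.vstack([X_lab, X_syn])
    mean = X_all.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_all - mean, full_matrices=False)
    W = Vt[:n_hidden].T                                # (d, n_hidden)
    return lambda X: np.hstack([X, (X - mean) @ W])

def nearest_centroid_predict(Z_train, y_train, Z_test):
    """Simple placeholder classifier trained in the extended space."""
    classes = np.unique(y_train)
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in classes])
    d2 = ((Z_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(d2, axis=1)]

# Toy data (fabricated for the demo only): two Gaussian blobs, 3 labels per class.
rng = np.random.default_rng(42)
d = 4
X_lab = np.vstack([rng.normal(-3, 1, (3, d)), rng.normal(3, 1, (3, d))])
y_lab = np.array([0, 0, 0, 1, 1, 1])
X_unl = np.vstack([rng.normal(-3, 1, (50, d)), rng.normal(3, 1, (50, d))])

X_syn, y_syn = oversample_with_unlabeled(X_lab, y_lab, X_unl, k=3)
to_extended = fit_extended_space(X_lab, X_syn, n_hidden=2)
Z_train = to_extended(np.vstack([X_lab, X_syn]))
y_train = np.concatenate([y_lab, y_syn])
y_pred = nearest_centroid_predict(Z_train, y_train, to_extended(X_unl))
print(X_syn.shape, Z_train.shape, y_pred.shape)
```

In this sketch the unlabeled points never receive pseudo-labels and never enter the training set directly. They only steer where the synthetic points are placed, which mirrors the first characteristic. The classifier then sees each sample as its original features concatenated with the hidden features, which mirrors the second.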

Keywords

Oversampling; Semi-supervised classification