Article ID: 410340
Journal: Neurocomputing
Published Year: 2013
Pages: 11
File Type: PDF
Abstract

Annotating class labels for a large number of time-series data is generally an expensive task. We propose novel semi-supervised learning algorithms that significantly improve classification accuracy by exploiting a relatively large amount of unlabeled data in conjunction with a few labeled samples. Our algorithms use the unlabeled data as regularizers, favoring classifiers with stronger certainty on the unlabeled data. For the hidden conditional random field, a state-of-the-art conditional probabilistic sequence model, we first adapt the entropy minimization algorithm previously applied in static classification setups. We then introduce more sophisticated margin-based approaches, motivated by semi-supervised support vector machines, which were originally designed for non-sequential data. We provide effective, principled ways to incorporate and minimize the hat loss function for sequence data via a probabilistic treatment. We demonstrate the performance improvement achieved by our methods in several semi-supervised time-series classification scenarios.
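The two regularizers the abstract mentions, entropy minimization and the hat loss, admit compact generic sketches. The Python code below is an illustrative sketch under assumptions, not the paper's implementation: score_fn, lam, and the data structures are hypothetical placeholders, and the hat loss is shown in its classical binary S3VM form rather than the paper's sequence-specific probabilistic version.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over per-class scores."""
    z = scores - scores.max()
    e = np.exp(z)
    return e / e.sum()

def entropy_regularized_loss(theta, labeled, unlabeled, score_fn, lam=0.1):
    """Supervised negative log-likelihood plus lam times the entropy of
    the class posterior on each unlabeled sequence (entropy minimization).
    score_fn(theta, x) is a hypothetical placeholder returning per-class
    scores for sequence x under parameters theta."""
    loss = 0.0
    for x, y in labeled:
        p = softmax(score_fn(theta, x))
        loss -= np.log(p[y] + 1e-12)              # standard NLL on labeled data
    for x in unlabeled:
        p = softmax(score_fn(theta, x))
        entropy = -np.sum(p * np.log(p + 1e-12))
        loss += lam * entropy                     # prefer confident posteriors
    return loss

def hat_loss(margin):
    """Classical S3VM hat loss for an unlabeled point: zero once the
    unsigned margin |f(x)| clears 1, i.e. the point lies outside the
    margin band regardless of which side it falls on."""
    return max(0.0, 1.0 - abs(margin))
```

For sequence models such as the hidden conditional random field there is no single scalar margin f(x); the "probabilistic treatment" the abstract refers to presumably derives a margin-like quantity from the model's class posteriors, so the hat_loss above should be read only as the non-sequential starting point.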

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence