Article ID: 558525
Journal: Computer Speech & Language
Published Year: 2009
Pages: 15
File Type: PDF
Abstract

This paper proposes a semi-supervised learning method for semantic relation extraction between named entities. Given a small amount of labeled data, the method exploits a large amount of unlabeled data in two steps. First, it bootstraps a moderate number of weighted support vectors from all the available data through a co-training procedure on top of support vector machines (SVM) with feature projection. Second, it applies a label propagation (LP) algorithm over the bootstrapped support vectors and the hard unlabeled instances that remain after SVM bootstrapping in order to classify unseen instances. Evaluation on the ACE RDC corpora shows that our method integrates the advantages of both SVM bootstrapping and label propagation: our LP algorithm via the bootstrapped support vectors and hard unlabeled instances significantly outperforms the normal LP algorithm via all the available data without SVM bootstrapping. Moreover, our LP algorithm significantly reduces the computational burden, especially when a large amount of labeled and unlabeled data is taken into consideration.
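
The label propagation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a generic graph-based LP variant with a Gaussian-kernel affinity, a row-normalized transition matrix, and clamping of labeled nodes after every iteration. The function name `label_propagation`, the `sigma` parameter, and the toy 2-D points are hypothetical choices made for illustration. In the paper's setting the graph nodes would presumably be the bootstrapped support vectors (as labeled nodes) and the hard unlabeled instances; the support vector weights are omitted here.

```python
import math


def label_propagation(points, labels, n_classes, sigma=1.0, max_iter=1000, tol=1e-6):
    """Generic label propagation sketch (illustrative, not the paper's code).

    points: list of feature vectors (tuples of floats).
    labels: class index for labeled nodes, None for unlabeled nodes.
    Returns the predicted class index for every node.
    """
    n = len(points)

    def affinity(a, b):
        # Gaussian kernel on squared Euclidean distance (an assumed similarity).
        d2 = sum((x - y) ** 2 for x, y in zip(a, b))
        return math.exp(-d2 / (sigma ** 2))

    # Pairwise affinities, then row-normalize into transition probabilities.
    W = [[affinity(points[i], points[j]) for j in range(n)] for i in range(n)]
    T = [[w / sum(row) for w in row] for row in W]

    def clamp(i):
        # One-hot distribution for a labeled node.
        return [1.0 if c == labels[i] else 0.0 for c in range(n_classes)]

    # Labeled nodes start one-hot, unlabeled nodes start uniform.
    Y = [clamp(i) if labels[i] is not None else [1.0 / n_classes] * n_classes
         for i in range(n)]

    for _ in range(max_iter):
        # Propagate: each node takes the weighted average of its neighbours.
        Y_new = [[sum(T[i][j] * Y[j][c] for j in range(n)) for c in range(n_classes)]
                 for i in range(n)]
        # Clamp: labeled nodes keep their original labels.
        for i in range(n):
            if labels[i] is not None:
                Y_new[i] = clamp(i)
        delta = max(abs(Y_new[i][c] - Y[i][c])
                    for i in range(n) for c in range(n_classes))
        Y = Y_new
        if delta < tol:
            break

    return [max(range(n_classes), key=lambda c: Y[i][c]) for i in range(n)]


# Toy data (hypothetical): two well-separated clusters, one labeled node each.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (3.0, 3.0), (3.2, 2.9), (2.9, 3.1)]
labels = [0, None, None, 1, None, None]
predicted = label_propagation(points, labels, n_classes=2)
```

Because the sketch builds a dense affinity matrix, each iteration costs on the order of the squared number of nodes. That is consistent with the abstract's point that running LP over a reduced node set, rather than over all available data, lowers the computational burden.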
