Article ID: 6941216
Journal: Pattern Recognition Letters
Published Year: 2015
Pages: 8
File Type: PDF
Abstract
Supervised topic models such as labeled latent Dirichlet allocation (L-LDA) have attracted increasing attention for multi-label classification. However, they do not account for the label frequency of a word (i.e., the number of labels containing the word), which is crucial for classification. To address this problem, we investigate the L-LDA model and propose an extension, the centroid prior topic model (CPTM). The class-feature-centroid (CFC) method provides a discriminative label-word vector that takes the label frequency of each word into account, and CPTM uses this CFC vector as the prior for the label-word distributions. Extensive experiments on the Yahoo! dataset were conducted to evaluate the algorithm. The results show that CPTM outperforms existing multi-label classification algorithms in terms of AUC, Macro-F1 and Micro-F1.
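The abstract defines the label frequency of a word but gives no formula for the CFC vector. As a minimal sketch, the Python below computes label frequency as defined above and a CFC-style weight, assuming the commonly cited form `b ** (DF / |C_j|) * log(|C| / LF)`. The function names, the base `b`, the toy data layout, and the rule that a multi-label document counts toward each of its labels are assumptions for illustration, not details taken from the paper. How CPTM turns these weights into a prior is not stated in the abstract and is not shown.

```python
import math
from collections import defaultdict


def label_frequency(docs):
    """Map each word to the number of distinct labels containing it.

    docs: iterable of (set_of_words, set_of_labels) pairs (assumed layout).
    """
    labels_of_word = defaultdict(set)
    for words, labels in docs:
        for word in words:
            labels_of_word[word].update(labels)
    return {word: len(labels) for word, labels in labels_of_word.items()}


def cfc_weights(docs, b=math.e):
    """CFC-style label-word weights (assumed formula, see lead-in).

    weight[label][word] = b ** (DF / |C_label|) * log(|C| / LF(word))
    where DF is the number of documents of `label` containing `word`,
    |C_label| is the number of documents carrying `label`,
    |C| is the number of labels, and LF is the label frequency.
    """
    docs = list(docs)
    docs_per_label = defaultdict(int)
    doc_freq = defaultdict(lambda: defaultdict(int))  # doc_freq[label][word]
    for words, labels in docs:
        for label in labels:
            # Assumption: a multi-label document counts toward every label.
            docs_per_label[label] += 1
            for word in words:
                doc_freq[label][word] += 1

    lf = label_frequency(docs)
    n_labels = len(docs_per_label)
    return {
        label: {
            word: b ** (count / docs_per_label[label])
            * math.log(n_labels / lf[word])
            for word, count in word_counts.items()
        }
        for label, word_counts in doc_freq.items()
    }


# Toy corpus (hypothetical): each document is (words, labels).
example_docs = [
    ({"goal", "match"}, {"sports"}),
    ({"goal", "vote"}, {"sports", "politics"}),
    ({"vote", "law"}, {"politics"}),
]
example_weights = cfc_weights(example_docs)
```

In this toy corpus, a word that occurs under every label receives weight zero because the logarithmic factor vanishes, while a word confined to one label keeps a positive weight. That is the discriminative effect the abstract attributes to label frequency.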
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition