Article ID: 4969623
Journal: Pattern Recognition
Published Year: 2017
Pages: 13
File Type: PDF
Abstract
Multi-modal data modeling has recently become an active research area in the pattern recognition community. Existing studies mainly focus on modeling the content of multi-modal documents, while the links among documents are commonly ignored. However, link information has been shown to be of key importance in many applications, such as document navigation, classification, and clustering. In this paper, we present a non-probabilistic formulation of the Relational Topic Model (RTM), namely Sparse Relational Multi-Modal Topical Coding (SRMMTC), to model both multi-modal documents and the corresponding link information. SRMMTC has three appealing properties: i) it can effectively produce sparse latent representations by directly imposing sparsity-inducing regularizers; ii) it handles the imbalance issues in multi-modal data collections by introducing separate regularization parameters for positive and negative links; iii) it can be solved by an efficient coordinate descent algorithm. We also explore a generalized version of SRMMTC to find pairwise interactions among topics. Our methods are also capable of predicting links between documents, as well as annotation words for the images that accompany documents. Empirical studies on a set of benchmark datasets show that our proposed models significantly outperform many state-of-the-art methods.
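To make properties i) and iii) concrete, the following is a minimal sketch of how coordinate descent with a sparsity-inducing L1 regularizer yields exactly-zero coefficients. It is not the SRMMTC objective, which the abstract does not state; it assumes a generic L1-regularized least-squares problem, and the names `sparse_code`, `soft_threshold`, `basis`, and `lam` are hypothetical, chosen only for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the proximal step for an L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sparse_code(x, basis, lam, n_sweeps=100):
    """Coordinate descent for 0.5 * ||x - sum_k s[k] * basis[k]||^2 + lam * ||s||_1.

    Illustrative assumption only, not the objective used in the paper.
    x      : list of floats (one document's feature vector)
    basis  : list of K basis vectors ("topics"), each the same length as x
    lam    : L1 regularization strength (larger values give sparser codes)
    """
    codes = [0.0] * len(basis)
    residual = list(x)  # x minus the reconstruction; equals x while all codes are zero
    for _ in range(n_sweeps):
        for k, topic in enumerate(basis):
            norm_sq = sum(t * t for t in topic)
            if norm_sq == 0.0:
                continue  # an all-zero basis vector cannot explain anything
            # Correlation of topic k with the residual that excludes topic k itself.
            rho = sum(t * r for t, r in zip(topic, residual)) + norm_sq * codes[k]
            new_code = soft_threshold(rho, lam) / norm_sq
            delta = new_code - codes[k]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                residual = [r - delta * t for r, t in zip(residual, topic)]
                codes[k] = new_code
    return codes
```

With the orthonormal basis `[[1.0, 0.0], [0.0, 1.0]]`, input `[3.0, 0.5]`, and `lam = 1.0`, the second coefficient falls below the threshold and is set exactly to zero, while the first is shrunk from 3.0 to 2.0. This shrink-to-zero behavior is what makes L1-type regularizers produce sparse representations.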