Sparse Relational Topical Coding on multi-modal data

Article ID	Journal	Published Year	Pages	File Type
4969623	Pattern Recognition	2017	13 Pages	PDF

Abstract

Multi-modal data modeling has recently become an active research area in the pattern recognition community. Existing studies mainly focus on modeling the content of multi-modal documents, whilst the links amongst documents are commonly ignored. However, link information has been shown to be of key importance in many applications, such as document navigation, classification, and clustering. In this paper, we present a non-probabilistic formulation of the Relational Topic Model (RTM), namely Sparse Relational Multi-Modal Topical Coding (SRMMTC), to model both multi-modal documents and the corresponding link information. SRMMTC has three appealing properties: i) it effectively produces sparse latent representations by directly imposing sparsity-inducing regularizers; ii) it handles imbalance issues in multi-modal data collections by introducing separate regularization parameters for positive and negative links; iii) it can be solved by an efficient coordinate descent algorithm. We also explore a generalized version of SRMMTC that finds pairwise interactions amongst topics. Our methods can also perform link prediction for documents, as well as prediction of annotation words for the images that accompany documents. Empirical studies on a set of benchmark datasets show that our proposed models significantly outperform many state-of-the-art methods.
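As a rough illustration of how a sparsity-inducing regularizer combined with coordinate descent yields sparse latent codes, the sketch below solves a generic L1-regularized least-squares problem. It is not the SRMMTC objective, which the abstract does not spell out: the squared-error loss, the names `soft_threshold` and `lasso_coordinate_descent`, and the toy dictionary `D` are all assumptions made for this example, and the link terms and the positive/negative link weights are omitted.

```python
def soft_threshold(x, lam):
    """Soft-thresholding operator: shrink x toward zero by lam."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def lasso_coordinate_descent(D, w, lam, n_iters=100):
    """Minimize 0.5 * ||w - D s||^2 + lam * ||s||_1 over the code s.

    D is a list of rows (n_features x n_topics), w a list of n_features
    observations. Both are hypothetical stand-ins for a topic dictionary
    and a document's feature vector.
    """
    n_rows, n_cols = len(D), len(D[0])
    s = [0.0] * n_cols
    residual = list(w)  # residual = w - D s, and s starts at zero
    col_sq = [sum(D[i][k] ** 2 for i in range(n_rows)) for k in range(n_cols)]
    for _ in range(n_iters):
        for k in range(n_cols):
            if col_sq[k] == 0.0:
                continue  # an all-zero column cannot explain anything
            # Correlation of column k with the residual, with the current
            # contribution of s[k] added back in.
            rho = sum(D[i][k] * residual[i] for i in range(n_rows))
            rho += col_sq[k] * s[k]
            # Closed-form minimizer of the one-dimensional subproblem.
            new_sk = soft_threshold(rho, lam) / col_sq[k]
            delta = new_sk - s[k]
            if delta != 0.0:
                for i in range(n_rows):
                    residual[i] -= D[i][k] * delta
                s[k] = new_sk
    return s


# Toy usage with an identity dictionary: each coordinate is shrunk by lam,
# and the small middle entry is driven exactly to zero.
D = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
codes = lasso_coordinate_descent(D, [3.0, 0.5, -2.0], lam=1.0)
print(codes)  # [2.0, 0.0, -1.0]
```

Each coordinate update has a closed form, which is what makes coordinate descent attractive for L1-regularized problems of this kind. A model in the spirit of SRMMTC would add link-dependent terms to the objective, with separate weights for positive and negative links.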

Keywords

Multi-modal data; Image annotation; Link prediction