Article ID Journal Published Year Pages File Type
4969259 Journal of Visual Communication and Image Representation 2017 15 Pages PDF
Abstract
A huge amount of text and multimedia (image and video) data concerning venues is constantly being generated. To model the semantics of these venues, it is essential to analyze both text and multimedia user-generated content (UGC) in an integrated manner. This task, however, is difficult for location-based social networks (LBSNs) because their text and multimedia UGC tend to be uncorrelated. In this paper, we propose a novel multimedia location topic modeling approach to address this problem. We first utilize Recurrent Convolutional Networks to build correlations between multimedia UGC and text. A graph is then constructed from these correlations, and a graph clustering method is employed to detect the latent multimedia topics for each venue. Based on the obtained venue semantics, we propose techniques to model multimedia location topics and to perform semantic-based location summarization, venue prediction, and image description. Extensive experiments conducted on a cross-platform dataset yield promising results that demonstrate the superiority of the proposed method.
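The graph-clustering step of the pipeline can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: nodes stand for UGC items (images, videos, text posts), edge weights stand in for the correlation scores the paper derives from Recurrent Convolutional Networks, and a simple threshold-plus-connected-components pass stands in for the paper's (unspecified) graph clustering method. All item names and scores below are invented for the example.

```python
def cluster_ugc(edges, threshold=0.5):
    """Group UGC items into latent topic clusters.

    edges: iterable of (item_a, item_b, correlation_score) triples,
           where the score is assumed to come from a learned
           text-multimedia correlation model.
    Returns a list of clusters (sets of item ids).
    """
    # Keep only strongly correlated pairs in the adjacency structure.
    adj = {}
    for a, b, w in edges:
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if w >= threshold:
            adj[a].add(b)
            adj[b].add(a)

    # Connected components via iterative depth-first search;
    # each component is treated as one latent multimedia topic.
    seen, clusters = set(), []
    for node in adj:
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:
            cur = stack.pop()
            if cur in seen:
                continue
            seen.add(cur)
            comp.add(cur)
            stack.extend(adj[cur] - seen)
        clusters.append(comp)
    return clusters


# Toy example: two items strongly correlated with one image form a topic;
# a weakly correlated pair stays apart.
edges = [
    ("img1", "tweet1", 0.9),
    ("img1", "img2", 0.7),
    ("img3", "tweet2", 0.2),  # below threshold: no topic edge
]
topics = cluster_ugc(edges)
```

In practice one would replace the threshold rule with the clustering criterion of the original method; the point here is only how venue topics fall out of connected structure in the correlation graph.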
Related Topics
Physical Sciences and Engineering Computer Science Computer Vision and Pattern Recognition