Article code: 10139438
Journal code: 1645964
Publication year: 2018
English article: 35-page PDF
Full-text version: Free download
English title of the ISI article
Deep multi-view representation learning for social images
Persian translation of the title
آموزش نمایشی عمیق چندین نمایه برای تصاویر اجتماعی
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
English abstract
Multi-view representation learning for social images has recently made remarkable achievements in many tasks, such as cross-view classification and cross-modal retrieval. Since social images usually contain link information besides the multi-modal contents (e.g., text descriptions and visual content), simply employing the data content may result in a sub-optimal multi-view representation of the social images. In this paper, we propose a Deep Multi-View Embedding Model (DMVEM) to learn joint embeddings for three views: the visual content, the associated text descriptions, and their relations. To effectively encode the link information, a weighted relation network is built based on the linkages between social images, which is then embedded into a low-dimensional vector space using the Skip-Gram model. The learned vector is regarded as the third view besides the visual content and text description. To learn a joint representation from the three views, a deep learning model with a three-branch nonlinear neural network is proposed. A three-view bi-directional loss function is used to capture the correlation between the three views. A stacked autoencoder is adopted to preserve the self-structure and reconstructability of the learned representation for each view. Comprehensive experiments are conducted on the tasks of image-to-text, text-to-image, and image-to-image search. Compared to state-of-the-art multi-view embedding methods, our approach achieves a significant improvement in performance.
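The abstract does not state the form of the three-view bi-directional loss. As a rough illustration only, the sketch below assumes a margin-based ranking loss on cosine similarity, applied in both directions to each of the three pairs of views and summed. The function names, the margin value, and the use of NumPy are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def bidirectional_hinge(a, b, margin=0.1):
    """Assumed bi-directional ranking loss; row i of `a` is paired with row i of `b`."""
    a, b = l2_normalize(a), l2_normalize(b)
    sim = a @ b.T                 # sim[i, j]: similarity of a_i and b_j
    pos = np.diag(sim)            # similarities of the matched pairs
    # Direction a -> b: a_i should rank its own b_i above every other b_j.
    cost_ab = np.maximum(0.0, margin + sim - pos[:, None])
    # Direction b -> a: b_j should rank its own a_j above every other a_i.
    cost_ba = np.maximum(0.0, margin + sim - pos[None, :])
    off_diag = ~np.eye(len(a), dtype=bool)   # ignore the matched pairs themselves
    return (cost_ab[off_diag].sum() + cost_ba[off_diag].sum()) / len(a)

def three_view_loss(img, txt, rel, margin=0.1):
    # Sum the pairwise losses over image-text, image-relation, text-relation.
    return (bidirectional_hinge(img, txt, margin)
            + bidirectional_hinge(img, rel, margin)
            + bidirectional_hinge(txt, rel, margin))
```

Here `img`, `txt`, and `rel` stand for the outputs of the three branches for the same batch of social images, each an array of shape (batch, dim). The loss is zero when every matched pair is more similar than every mismatched pair by at least the margin.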
Publisher
Database: Elsevier - ScienceDirect
Journal: Applied Soft Computing - Volume 73, December 2018, Pages 106-118