Article ID: 6863560
Journal: Neurocomputing
Published Year: 2018
Pages: 33
File Type: PDF
Abstract
Sentiment analysis plays an important role in the behavioral sciences; it aims to determine the attitude of a speaker or writer toward some topic, or the overall contextual polarity of a document. The problem is nevertheless non-trivial, especially when inferring sentiment or emotion from visual content, such as the images and videos that are becoming pervasive on the Web. Observing that the sentiment of an image may be reflected by only some spatial regions, a valid question is how to locate the attended spatial areas to enhance image sentiment analysis. In this paper, we present Sentiment Networks with visual Attention (SentiNet-A), a novel architecture that integrates visual attention into the successful Convolutional Neural Network (CNN) sentiment classification framework and is trained in an end-to-end manner. To model visual attention, we develop multiple layers that generate an attention distribution over the regions of the image. Furthermore, the saliency map of the image is employed as a priori knowledge and as a regularizer to holistically refine the attention distribution for sentiment prediction. Extensive experiments are conducted on both the Twitter and ARTphoto benchmarks, and our framework achieves superior results compared to state-of-the-art techniques.
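To make the architecture described above concrete, the sketch below illustrates the general pattern the abstract outlines: an attention distribution computed over the spatial regions of a CNN feature map, attention-weighted pooling for sentiment classification, and a saliency map used as a prior that regularizes the attention distribution. This is a minimal illustrative sketch, not the paper's actual SentiNet-A implementation; the module name, the 1x1-convolution scoring layer, and the KL-divergence form of the saliency regularizer are all assumptions, since the abstract does not specify the exact layers or loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegionAttentionSentiment(nn.Module):
    """Illustrative sketch (not the paper's exact model): region-level
    visual attention over a CNN feature map, with an optional
    saliency-based regularizer on the attention distribution.

    Assumes a CNN backbone that outputs features of shape (B, C, H, W);
    each of the H*W spatial positions is treated as one image region.
    """

    def __init__(self, channels: int, num_classes: int = 2):
        super().__init__()
        # Hypothetical scoring layer: a 1x1 conv assigns each region a scalar score.
        self.att_score = nn.Conv2d(channels, 1, kernel_size=1)
        self.classifier = nn.Linear(channels, num_classes)

    def forward(self, feats: torch.Tensor, saliency: torch.Tensor = None):
        B, C, H, W = feats.shape
        # Attention distribution over the H*W regions (softmax-normalized).
        scores = self.att_score(feats).view(B, H * W)
        att = F.softmax(scores, dim=1)                             # (B, H*W)
        # Attention-weighted pooling of region features into one vector.
        regions = feats.view(B, C, H * W)                          # (B, C, H*W)
        pooled = torch.bmm(regions, att.unsqueeze(2)).squeeze(2)   # (B, C)
        logits = self.classifier(pooled)
        # Assumed regularizer: KL divergence pulling the attention
        # distribution toward the (downsampled, normalized) saliency map.
        reg = None
        if saliency is not None:
            prior = saliency.view(B, H * W)
            prior = prior / prior.sum(dim=1, keepdim=True).clamp_min(1e-8)
            reg = F.kl_div(att.clamp_min(1e-8).log(), prior,
                           reduction="batchmean")
        return logits, att, reg
```

Under this sketch, end-to-end training would minimize a combined objective such as cross-entropy on `logits` plus a weighted `reg` term, so the classifier and the attention layers are learned jointly while the saliency prior steers where the attention concentrates.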
Related Topics
Physical Sciences and Engineering Computer Science Artificial Intelligence
Authors
, , , ,