Article ID: 4944986
Journal: Information Sciences
Published Year: 2016
Pages: 28
File Type: PDF
Abstract
Word representation is crucial to the lexical features used in Twitter sentiment analysis models. Recent work has demonstrated that dense, low-dimensional, real-valued word embeddings give competitive performance for Twitter sentiment classification. We follow this line of work and propose a topic-enhanced word embedding for the task; topic information has generally been neglected in previous work. First, we use a recursive autoencoder framework to learn the topic-enhanced word embedding, where topic information is generated through topic modeling based on an effective implementation of Latent Dirichlet Allocation (LDA). We then adopt a Support Vector Machine (SVM) classifier as a uniform framework to compare existing word representation methods with our method. Experimental results on the dataset show that topic-enhanced word embedding is very effective for Twitter sentiment classification.
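To make the idea of a topic-enhanced representation concrete, here is a minimal sketch in Python. It is not the recursive autoencoder described in the abstract. As a simplifying assumption, it concatenates each word's embedding with a per-word topic distribution (such as one an LDA model might supply) and averages the result over a tweet to obtain a fixed-length feature vector of the kind an SVM classifier could consume. The function names, the toy vocabulary, and all numeric values are hypothetical.

```python
from typing import Dict, List, Sequence


def topic_enhanced_vector(word_vec: Sequence[float],
                          topic_dist: Sequence[float]) -> List[float]:
    """Concatenate a word embedding with that word's topic distribution.

    Assumption: plain concatenation stands in for the learned combination
    that the abstract attributes to a recursive autoencoder.
    """
    return list(word_vec) + list(topic_dist)


def tweet_features(tokens: Sequence[str],
                   embeddings: Dict[str, Sequence[float]],
                   word_topics: Dict[str, Sequence[float]],
                   dim: int,
                   n_topics: int) -> List[float]:
    """Average the topic-enhanced vectors of the known tokens in a tweet."""
    total = [0.0] * (dim + n_topics)
    count = 0
    for tok in tokens:
        # Skip out-of-vocabulary tokens instead of guessing a vector.
        if tok in embeddings and tok in word_topics:
            vec = topic_enhanced_vector(embeddings[tok], word_topics[tok])
            total = [a + b for a, b in zip(total, vec)]
            count += 1
    if count == 0:
        return total  # all-zero vector when no token is known
    return [x / count for x in total]


# Toy, made-up values: 2-dimensional embeddings and 2 topics.
embeddings = {"good": [0.5, 0.1], "phone": [0.2, 0.4]}
word_topics = {"good": [0.9, 0.1], "phone": [0.2, 0.8]}

features = tweet_features(["good", "phone", "lol"],
                          embeddings, word_topics, dim=2, n_topics=2)
```

In this sketch `features` has length `dim + n_topics`, so every tweet maps to a vector of the same size regardless of its length. In a fuller pipeline, the topic distributions would come from a fitted LDA model and the resulting vectors would be passed to an SVM for training.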
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence