Article ID: 551974
Journal: Decision Support Systems
Published Year: 2016
Pages: 15
File Type: PDF
Abstract

• We assess the added value of leading and lagging information in sentiment analysis.
• We analyze 17,697 Facebook status updates.
• We use two classification algorithms, five times twofold cross-validation and the Friedman test.
• Including leading and lagging data increases the AUC substantially.
• These findings clearly indicate that including leading and lagging data is a viable strategy.

The purpose of this study is to (1) assess the added value of information available before (i.e., leading) and after (i.e., lagging) the focal post's creation time in sentiment analysis of Facebook posts, (2) determine which predictors are most important, and (3) investigate the relationship between top predictors and sentiment. We build a sentiment prediction model, including leading information, lagging information, and traditional post variables. We benchmark Random Forest and Support Vector Machines using five times twofold cross-validation. The results indicate that both leading and lagging information increase the model's predictive performance. The most important predictors include the number of uppercase letters, the number of likes and the number of negative comments. A higher number of uppercase letters and likes increases the likelihood of a positive post, while a higher number of comments increases the likelihood of a negative post. The main contribution of this study is that it is the first to assess the added value of leading and lagging information in the context of sentiment analysis.
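The evaluation protocol named in the abstract (five times twofold cross-validation, scored by AUC) can be illustrated with a minimal sketch. This is not the authors' code: the function names `five_by_two_splits` and `auc`, the random seed, and the use of unstratified splits are all assumptions made for illustration, and the classifiers themselves are left out.

```python
import random


def five_by_two_splits(n, repeats=5, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated twofold cross-validation.

    Assumption: plain random halves; the abstract does not say whether
    the splits were stratified by class.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        half = n // 2
        first, second = idx[:half], idx[half:]
        yield first, second   # train on the first half, test on the second
        yield second, first   # then swap the roles of the two halves


def auc(labels, scores):
    """AUC as the probability that a random positive example is scored
    above a random negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Five repeats of a twofold split give ten train/test pairs in total.
splits = list(five_by_two_splits(10))
```

In such a setup, each candidate model would be fitted on every training half and scored with `auc` on the matching test half, giving ten AUC values per model that can then be compared across models.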
