Article ID: 6855025
Journal: Expert Systems with Applications
Published Year: 2018
Pages: 14
File Type: PDF
Abstract
Sentiment analysis helps evaluate the performance of products or services from user-generated content. Lexicon-based sentiment analysis approaches are preferred over learning-based ones when training data is inadequate. Existing lexicons contain only unigrams along with their sentiment scores. It has been observed that sentiment n-grams formed by combining unigrams with intensifiers or negations yield improved results, but such sentiment n-gram lexicons are not publicly available. This paper presents a methodology for creating such a lexicon, called Senti-N-Gram. The proposed rule-based approach extracts n-gram sentiment scores from a random corpus of product reviews and their corresponding numeric ratings on a five-point scale. The scores from this automated procedure are compared with those of human annotators using a t-test and found to be statistically equivalent. The paper also proposes a sentiment classification methodology that uses a ratio-based approach built on the counts of positive and negative sentences in a document. When used with the Senti-N-Gram lexicon, the proposed method outperforms a well-known unigram-lexicon-based approach using VADER and an n-gram sentiment analysis approach, SO-CAL.
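The abstract does not spell out the classification rule, so the following is only a minimal sketch of what a ratio-based classifier over positive and negative sentence counts might look like. Everything in it is an assumption made for illustration: the toy lexicon and its scores (not values from Senti-N-Gram), the longest-match n-gram lookup, the sentence splitter, the `threshold` parameter, and the function names `sentence_score` and `classify_document`.

```python
import re

# Hypothetical toy lexicon; scores are illustrative, NOT taken from Senti-N-Gram.
TOY_LEXICON = {
    "good": 0.6,
    "very good": 0.9,   # intensifier + unigram
    "not good": -0.6,   # negation + unigram
    "bad": -0.6,
    "not bad": 0.3,
    "terrible": -0.9,
}
MAX_N = 2  # longest n-gram in the toy lexicon


def sentence_score(sentence, lexicon=TOY_LEXICON, max_n=MAX_N):
    """Sum lexicon scores, preferring the longest matching n-gram at each position."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    total, i = 0.0, 0
    while i < len(tokens):
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            gram = " ".join(tokens[i:i + n])
            if gram in lexicon:
                total += lexicon[gram]
                i += n  # consume the matched n-gram
                break
        else:
            i += 1  # no lexicon entry starts at this token
    return total


def classify_document(text, threshold=1.0):
    """Label a document from the ratio of positive to negative sentence counts.

    The threshold of 1.0 is an assumed cut-off, not one stated in the abstract.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    scores = [sentence_score(s) for s in sentences]
    pos = sum(1 for x in scores if x > 0)
    neg = sum(1 for x in scores if x < 0)
    if pos == 0 and neg == 0:
        return "neutral"
    if neg == 0:
        return "positive"
    ratio = pos / neg
    if ratio > threshold:
        return "positive"
    if ratio < threshold:
        return "negative"
    return "neutral"


review = "The screen is very good. The battery is not bad. The speaker is terrible."
print(classify_document(review))
```

In this sketch, the bigrams "very good" and "not bad" are matched before their unigram parts, which is the kind of case where an n-gram lexicon and a unigram-only lexicon would be expected to disagree.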
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence