Article ID: 10355181
Journal: Information Processing & Management
Published Year: 2015
Pages: 22 Pages
File Type: PDF
Abstract
With the rise of Web 2.0 platforms, personal opinions such as reviews, ratings, recommendations, and other forms of user-generated content have fueled interest in sentiment classification in both academia and industry. To improve sentiment classification performance, previous research has investigated ensemble methods and shown them to be effective both theoretically and empirically. We advance this line of research by proposing POS-RS, an enhanced Random Subspace method for sentiment classification based on part-of-speech analysis. Unlike existing Random Subspace methods, which use a single subspace rate to control the diversity of base learners, POS-RS employs two parameters, the content lexicon subspace rate and the function lexicon subspace rate, to control the balance between the accuracy and the diversity of base learners. Ten publicly available sentiment datasets were used to verify the effectiveness of the proposed method. Empirical results show that POS-RS achieves the best performance, reducing both bias and variance relative to the base learner, a Support Vector Machine. These results indicate that POS-RS is a viable method for sentiment classification and has the potential to be applied successfully to other text classification problems.
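The two-rate subspace idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that content words are those tagged as nouns, verbs, adjectives, or adverbs under Penn Treebank-style tags, that every other term is a function word, and that base-learner outputs are combined by majority vote. All function names and the `CONTENT_TAG_PREFIXES` constant are hypothetical, and training the Support Vector Machine base learners on each subspace is left out.

```python
import random
from collections import Counter

# Assumption: Penn Treebank-style tag prefixes mark content words (nouns,
# verbs, adjectives, adverbs); every other tag is treated as a function word.
CONTENT_TAG_PREFIXES = ("NN", "VB", "JJ", "RB")


def split_lexicon(tagged_vocab):
    """Partition a {term: pos_tag} mapping into content and function term lists."""
    content, function = [], []
    for term, tag in sorted(tagged_vocab.items()):
        if tag.startswith(CONTENT_TAG_PREFIXES):
            content.append(term)
        else:
            function.append(term)
    return content, function


def sample_subspace(content, function, content_rate, function_rate, rng):
    """Draw one feature subspace, using a separate sampling rate per lexicon."""
    n_content = round(content_rate * len(content))
    n_function = round(function_rate * len(function))
    # Sampling without replacement inside each lexicon.
    return rng.sample(content, n_content) + rng.sample(function, n_function)


def build_subspaces(tagged_vocab, content_rate, function_rate, n_learners, seed=0):
    """Return one feature subset per base learner."""
    rng = random.Random(seed)
    content, function = split_lexicon(tagged_vocab)
    return [
        sample_subspace(content, function, content_rate, function_rate, rng)
        for _ in range(n_learners)
    ]


def majority_vote(predictions):
    """Combine the base learners' predicted labels for a single document."""
    return Counter(predictions).most_common(1)[0][0]
```

In this sketch, a high content rate paired with a low function rate keeps most sentiment-bearing terms in every subspace while still varying the features each base learner sees. That is one plausible reading of how two rates could trade accuracy against diversity. Each returned subset would then be used to train one base learner on the corresponding feature columns.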
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science Applications