A novel automatic satire and irony detection using ensembled feature selection and data mining

Article ID	Journal	Published Year	Pages	File Type
4946161	Knowledge-Based Systems	2017	40 Pages	PDF

Abstract

Detecting figurative language is difficult even for human beings, and automating it with text and data mining is harder still. The available computational approaches are also quite limited in capability and scope. We therefore propose an ensembled text feature selection method, followed by a new framework in the text and data mining paradigm, to automatically detect satire, sarcasm, and irony in news and customer reviews. We demonstrate the effectiveness of the proposed approach on three datasets: two satiric and one ironic. The proposed methodology performed well on one satiric dataset and yielded promising results on the remaining two. Moreover, we identified some interesting characteristics common to satire and irony, extracted from the three corpora: affective process (negative emotion), personal concern (leisure), biological process (body and sexual), perception (see), informal language (swear), social process (male), cognitive process (certain), and psycholinguistic features (concreteness and imageability). Of particular significance is the comparison of our approach with human annotators' evaluations, which served as a baseline in these tasks.

Keywords

LIWC; Irony detection; Customer reviews; Sentiment analysis