Article code | Journal code | Publication year | English article | Full-text version
515838 | 867108 | 2014 | 15 pages, PDF | Free download
English title of the ISI article
Supervised sentiment analysis in Czech social media
Persian translation of the title
نظارت بر تحلیل احساسات در رسانه های اجتماعی چک
Keywords
Sentiment analysis, Czech language, social media, machine learning, feature selection
Related topics
Engineering and basic sciences; Computer engineering; Computer science software
English abstract


• We explore state-of-the-art supervised machine learning methods for sentiment analysis of Czech social media.
• We provide a large human-annotated Czech social media corpus.
• We explore different pre-processing techniques and employ various features and classifiers.
• We experiment with five different feature selection algorithms.
• Results are also reported on other widely popular domains, such as movie and product reviews.

This article describes in-depth research on machine learning methods for sentiment analysis of Czech social media. Whereas in English, Chinese, or Spanish this field has a long history and evaluation datasets for various domains are widely available, in the case of the Czech language no systematic research has yet been conducted. We tackle this issue and establish a common ground for further research by providing a large human-annotated Czech social media corpus. Furthermore, we evaluate state-of-the-art supervised machine learning methods for sentiment analysis. We explore different pre-processing techniques and employ various features and classifiers. We also experiment with five different feature selection algorithms and investigate the influence of named entity recognition and preprocessing on sentiment classification performance. Moreover, in addition to our newly created social media dataset, we also report results for other popular domains, such as movie and product reviews. We believe that this article will not only extend the current sentiment analysis research to another family of languages, but will also encourage competition, potentially leading to the production of high-end commercial solutions.
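The kind of pipeline the abstract refers to (pre-processing, feature extraction, feature selection, then a supervised classifier) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the five feature selection algorithms or the classifiers used, so mutual information and multinomial Naive Bayes stand in as common choices, the whitespace tokenizer is a placeholder for real pre-processing, and the tiny English training set is invented rather than taken from the Czech corpus.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Placeholder pre-processing: lowercase and split on whitespace.
    return text.lower().split()

def select_features(docs, labels, k):
    """Keep the k unigrams with the highest mutual information with the label."""
    n = len(docs)
    class_counts = Counter(labels)
    df = Counter()                    # number of documents containing each term
    df_class = defaultdict(Counter)   # same count, split by class
    for tokens, y in zip(docs, labels):
        for t in set(tokens):
            df[t] += 1
            df_class[t][y] += 1

    def mi(t):
        score = 0.0
        for y, ny in class_counts.items():
            for present in (True, False):
                n_t = df[t] if present else n - df[t]
                n_ty = df_class[t][y] if present else ny - df_class[t][y]
                if n_ty > 0:
                    score += (n_ty / n) * math.log((n_ty * n) / (n_t * ny))
        return score

    # Sort by descending score; ties are broken alphabetically.
    return set(sorted(df, key=lambda t: (-mi(t), t))[:k])

def train_nb(docs, labels, vocab):
    """Multinomial Naive Bayes with add-one smoothing over the selected vocabulary."""
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, y in zip(docs, labels):
        term_counts[y].update(t for t in tokens if t in vocab)
    model = {}
    for y, ny in class_counts.items():
        total = sum(term_counts[y].values()) + len(vocab)
        log_prior = math.log(ny / len(labels))
        log_like = {t: math.log((term_counts[y][t] + 1) / total) for t in vocab}
        model[y] = (log_prior, log_like)
    return model

def predict(model, tokens):
    def score(y):
        log_prior, log_like = model[y]
        # Tokens outside the selected vocabulary are ignored.
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(sorted(model), key=score)

# Invented toy data, for illustration only.
train = [
    ("great phone love it", "pos"),
    ("really great service", "pos"),
    ("love this movie", "pos"),
    ("terrible phone hate it", "neg"),
    ("really bad service", "neg"),
    ("hate this movie", "neg"),
]
docs = [tokenize(text) for text, _ in train]
labels = [label for _, label in train]

vocab = select_features(docs, labels, k=4)
model = train_nb(docs, labels, vocab)
print(predict(model, tokenize("great movie love")))
print(predict(model, tokenize("hate this bad service")))
```

Words that occur equally often in both classes (such as "phone" or "movie" in the toy data) score zero mutual information and are dropped, which is the basic effect feature selection is meant to have on a sentiment classifier.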

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 50, Issue 5, September 2014, Pages 693–707