Article code: 558285
Journal code: 874892
Publication year: 2014
English article: 18 pages
Full-text version: PDF
English title of the ISI article
SAMAR: Subjectivity and sentiment analysis for Arabic social media
Keywords
Subjectivity and sentiment analysis; morphologically rich language; Arabic; social media data
Related topics
Engineering and basic sciences; computer engineering; signal processing
Abstract


• We present a system for subjectivity and sentiment analysis (SSA) for Arabic social media data.
• Individual settings are required per genre and task.
• Using either lemmas or lexemes improves SSA results.
• Using a POS tagset leads to improved results, as do standard features.
• Processing dialects does not improve when it is known which sentences are in dialect.
• Genre-specific features tend to be helpful for sentiment analysis, but not for subjectivity.

SAMAR is a system for subjectivity and sentiment analysis (SSA) of Arabic social media genres. Arabic is a morphologically rich language, which poses significant difficulties for standard approaches to building SSA systems designed for English. Beyond the difficulties of processing social media genres, Arabic inherently has a large number of variable word forms, which leads to data sparsity. In this context, we address four pertinent issues: how best to represent lexical information; whether standard features used for English are useful for Arabic; how to handle Arabic dialects; and whether genre-specific features have a measurable impact on performance. Our results show that using either lemma or lexeme information is helpful, as is using the two part-of-speech tagsets (RTS and ERTS). The results also show that each genre and task needs an individualized solution, although lemmatization and the ERTS POS tagset are present in a majority of the settings.
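The data-sparsity point can be illustrated with a small sketch. It is not the SAMAR implementation: the romanized tokens, the `TOY_LEMMAS` table, and the `lexical_features` helper are all hypothetical and stand in for the output of a real morphological analyzer. The sketch only shows why counting lemmas instead of surface forms shrinks the feature vocabulary.

```python
from collections import Counter

# Hypothetical toy lemma table (romanized forms, for illustration only).
# A real system would obtain lemmas from a morphological analyzer.
TOY_LEMMAS = {
    "kitab": "kitab", "alkitab": "kitab", "wakitab": "kitab", "kitabi": "kitab",
    "jamil": "jamil", "aljamil": "jamil", "jamila": "jamil",
}


def lexical_features(tokens, use_lemmas):
    """Return bag-of-words counts over surface forms or over lemmas."""
    if use_lemmas:
        # Unknown tokens fall back to their surface form.
        tokens = [TOY_LEMMAS.get(tok, tok) for tok in tokens]
    return Counter(tokens)


tokens = ["alkitab", "jamil", "wakitab", "aljamil", "kitabi", "jamila"]
surface = lexical_features(tokens, use_lemmas=False)
lemmas = lexical_features(tokens, use_lemmas=True)

# Number of distinct features under each representation.
print(len(surface), len(lemmas))
```

With surface forms every token is its own feature, whereas the lemma representation collapses them into two features with higher counts, which gives a classifier more evidence per feature.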

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 28, Issue 1, January 2014, Pages 20–37