Free article download: Building a Twitter opinion lexicon from automatically-annotated tweets

Article code	Journal code	Publication year	English article	Full-text version
571788	1439293	2016	14-page PDF	Free download

English title of the ISI article

Building a Twitter opinion lexicon from automatically-annotated tweets

Keywords

Lexicon expansion; Sentiment analysis; Twitter

Related topics

Engineering and Basic Sciences; Computer Engineering; Artificial Intelligence

Abstract

• We propose a supervised model for expanding an opinion lexicon for Twitter.
• We combine automatically annotated tweets with existing hand-made opinion lexicons.
• We use POS tags and associations between words and sentiment as word-level features.
• Expanded words are mapped to a positive, negative, and neutral distribution.
• We outperform the results obtained by using PMI semantic orientation alone.

Opinion lexicons, which are lists of terms labeled by sentiment, are widely used resources to support automatic sentiment analysis of textual passages. However, existing resources of this type exhibit some limitations when applied to social media messages such as tweets (posts on Twitter), because they are unable to capture the diversity of informal expressions commonly found in this type of media.

In this article, we present a method that combines information from automatically annotated tweets and existing hand-made opinion lexicons to expand an opinion lexicon in a supervised fashion. The expanded lexicon contains part-of-speech (POS) disambiguated entries with a probability distribution for positive, negative, and neutral polarity classes, similarly to SentiWordNet.

To obtain this distribution using machine learning, we propose word-level attributes based on (a) the morphological information conveyed by POS tags and (b) associations between words and the sentiment expressed in the tweets that contain them. We consider tweets with both hard and soft sentiment labels. The sentiment associations are modeled in two different ways: using pointwise mutual information semantic orientation (PMI-SO), and using stochastic gradient descent semantic orientation (SGD-SO), which learns a linear relationship between words and sentiment. The training dataset is labeled by a seed lexicon formed by combining multiple hand-annotated lexicons.

Our experimental results show that our method outperforms the three-dimensional word-level polarity classification performance obtained by using PMI-SO alone. This is significant because PMI-SO is a state-of-the-art measure for establishing word-level sentiment. Additionally, we show that lexicons created with our method achieve significant improvements over SentiWordNet for classifying tweets into polarity classes, and also outperform SentiStrength in the majority of the experiments.
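To make the PMI-SO association mentioned in the abstract concrete, here is a minimal sketch of the standard semantic orientation score, PMI(word, positive) minus PMI(word, negative), computed from tweets with hard labels. It is an illustration only and is not taken from the paper: the function name `pmi_so`, the toy tweets, the add-one smoothing, and the choice to count each word once per tweet are all assumptions. The paper's supervised method uses such a score as one word-level feature among others, which this sketch does not cover.

```python
import math
from collections import Counter


def pmi_so(labeled_tweets, smoothing=1.0):
    """Return {word: PMI-SO score} from (tokens, label) pairs.

    Labels must be "pos" or "neg". The score is
    log2( count(w, pos) * N_neg / (count(w, neg) * N_pos) ),
    which equals PMI(w, pos) - PMI(w, neg).
    Add-`smoothing` on the word counts is an assumption of this sketch.
    """
    word_counts = {"pos": Counter(), "neg": Counter()}
    tweet_counts = Counter()
    for tokens, label in labeled_tweets:
        tweet_counts[label] += 1
        # Count each word at most once per tweet (assumption).
        word_counts[label].update(set(tokens))

    if tweet_counts["pos"] == 0 or tweet_counts["neg"] == 0:
        raise ValueError("need at least one positive and one negative tweet")

    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    scores = {}
    for word in vocab:
        pos = word_counts["pos"][word] + smoothing
        neg = word_counts["neg"][word] + smoothing
        scores[word] = math.log2(
            (pos * tweet_counts["neg"]) / (neg * tweet_counts["pos"])
        )
    return scores


# Toy, hand-made data for illustration only.
tweets = [
    (["great", "day"], "pos"),
    (["great", "movie"], "pos"),
    (["awful", "day"], "neg"),
    (["awful", "traffic"], "neg"),
]
scores = pmi_so(tweets)
```

On this toy input, words seen only in positive tweets get a score above zero, words seen only in negative tweets get a score below zero, and a word that appears equally often in both classes scores zero.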

Publisher

Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 108, 15 September 2016, Pages 65–78

Authors

Felipe Bravo-Marquez, Eibe Frank, Bernhard Pfahringer
