کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
6854943 | 1437600 | 2018 | 46 صفحه PDF | دانلود رایگان |
عنوان انگلیسی مقاله ISI
A domain transferable lexicon set for Twitter sentiment analysis using a supervised machine learning approach
ترجمه فارسی عنوان
یک واژگان قابل انتقال برای سنجش احساسات توییتر با استفاده از روش یادگیری دستگاه نظارت شده
دانلود مقاله + سفارش ترجمه
دانلود مقاله ISI انگلیسی
رایگان برای ایرانیان
کلمات کلیدی
موضوعات مرتبط
مهندسی و علوم پایه
مهندسی کامپیوتر
هوش مصنوعی
چکیده انگلیسی
The Twitter messaging service has become a platform for customers and news consumers to express sentiments. Accurately capturing these sentiments has been challenging for researchers. The traditional approaches to Twitter Sentiment Analysis (TSA) include dictionary-based and use supervised machine learning tools for sentiment classification. This research follows the supervised machine learning approach. A major challenge for the machine learning approach is feature selection, which is often domain dependent. We address this specific challenge and present a novel approach to identify a lexicon set unique to TSA. We show that this Twitter Specific Lexicon Set (TSLS) is small, and most importantly, is domain transferable. This identification process generates a collection of vectorized tweets for input to machine learning tools. In traditional approaches, this vectorization often results in a highly sparse input matrix which produces low accuracy measures. In this research, we hierarchically reduce the feature set to a small set of seven “meta features” to reduce sparsity. We show that TSA based on these features can produce highly accurate results using a dynamic architecture for neural networks (DAN2) and SVM (machine learning tools) as measured by recall, precision, and F1 metrics (the harmonic average of precision and recall). Our results show that a Twitter Generic Feature Set (TGFS) derived from two datasets (@JustinBieber and @Starbucks) is domain transferable and when combined with only a few Twitter Domain Specific Features (TDSF) (less than 3%), can produce excellent sentiment classification values. We evaluate the effectiveness and transferability of the TGFS across three new and distinct domains (@GovChristie, @SouthwestAir, and @VerizonWireless).
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Expert Systems with Applications - Volume 106, 15 September 2018, Pages 197-216
Journal: Expert Systems with Applications - Volume 106, 15 September 2018, Pages 197-216
نویسندگان
M. Ghiassi, S. Lee,