Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
6854828 | 1437596 | 2018 | 13-page PDF | Free download |
English title of the ISI article
A comparative evaluation of pre-processing techniques and their interactions for Twitter sentiment analysis
Persian translation of the title
ارزیابی مقایسه ای از تکنیک های پیش پردازش و تعاملات آنها برای تحلیل احساسات توییتر
Related topics
Engineering and basic sciences
Computer engineering
Artificial intelligence
English abstract
Pre-processing is the first step in text classification, and choosing the right pre-processing techniques can improve classification effectiveness. We experimentally compare 16 commonly used pre-processing techniques on two Twitter datasets for Sentiment Analysis, employing four popular machine learning algorithms, namely Linear SVC, Bernoulli Naïve Bayes, Logistic Regression, and Convolutional Neural Networks. We evaluate the pre-processing techniques on their resulting classification accuracy and the number of features they produce. We find that techniques like lemmatization, removing numbers, and replacing contractions improve accuracy, while others, like removing punctuation, do not. Finally, in order to investigate interactions (desirable or otherwise) between the techniques when they are employed simultaneously in a pipeline fashion, an ablation and combination study is conducted. The results of ablation and combination show the significance of techniques such as replacing numbers and replacing repetitions of punctuation.
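To make the pipeline and ablation idea in the abstract concrete, the following is a minimal sketch in Python using only the standard library. It is not the authors' implementation: the contraction table, the placeholder tokens `multiExclamation` and `multiQuestion`, and all function names are assumptions chosen for illustration, and vocabulary size stands in for "number of features" without training any classifier.

```python
import re

# Hypothetical, deliberately tiny contraction table (assumption for illustration).
CONTRACTIONS = {"can't": "cannot", "won't": "will not",
                "don't": "do not", "it's": "it is"}
_CONTRACTION_RE = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in CONTRACTIONS) + r")\b",
    re.IGNORECASE,
)

def replace_contractions(text):
    # Expand each known contraction, matching case-insensitively.
    return _CONTRACTION_RE.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)

def remove_numbers(text):
    # Delete every run of digits.
    return re.sub(r"\d+", "", text)

def replace_punctuation_repetitions(text):
    # Map runs of "!" or "?" to placeholder tokens (token names are assumed).
    text = re.sub(r"!{2,}", " multiExclamation ", text)
    return re.sub(r"\?{2,}", " multiQuestion ", text)

def remove_punctuation(text):
    # Drop everything that is neither a word character nor whitespace.
    return re.sub(r"[^\w\s]", "", text)

TECHNIQUES = {
    "replace_contractions": replace_contractions,
    "remove_numbers": remove_numbers,
    "replace_punctuation_repetitions": replace_punctuation_repetitions,
    "remove_punctuation": remove_punctuation,
}

def run_pipeline(text, names):
    # Apply the named techniques in order, then lowercase and normalize spaces.
    for name in names:
        text = TECHNIQUES[name](text)
    return " ".join(text.lower().split())

def vocabulary_size(tweets, names):
    # Count distinct whitespace-separated tokens: a rough proxy for feature count.
    vocab = set()
    for tweet in tweets:
        vocab.update(run_pipeline(tweet, names).split())
    return len(vocab)

def ablation(tweets, names):
    # For each technique, measure the vocabulary when only that one is left out.
    return {
        left_out: vocabulary_size(tweets, [n for n in names if n != left_out])
        for left_out in names
    }

if __name__ == "__main__":
    tweets = ["I can't wait!!! 2 days", "It's 5pm, can't sleep??"]
    order = list(TECHNIQUES)
    print(run_pipeline(tweets[0], order))
    print(vocabulary_size(tweets, order))
    print(ablation(tweets, order))
```

The order of the techniques matters in this sketch: removing punctuation before expanding contractions would strip the apostrophes the contraction pattern relies on, which is one simple example of the kind of interaction between techniques that an ablation study can surface.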
Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 110, 15 November 2018, Pages 298-310
Authors
Symeon Symeonidis, Dimitrios Effrosynidis, Avi Arampatzis