Article code | Journal code | Publication year | English article | Full text
392595 | 664991 | 2016 | 18-page PDF | Free download
English title of the ISI article
Using unsupervised information to improve semi-supervised tweet sentiment classification
Persian translation of the title
استفاده از اطلاعات بدون نظارت برای بهبود طبقه‌بندی نیمه‌نظارتی احساسات توییت‌ها
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Supervised algorithms require a set of representative labeled data for building classification models. However, labeled data are usually difficult and expensive to obtain, which motivates interest in semi-supervised learning. This type of learning uses both labeled and unlabeled data in the training process and is particularly useful in applications such as tweet sentiment analysis, where a large amount of unlabeled data is available. Semi-supervised learning for tweet sentiment analysis, although quite appealing, is relatively new. We propose a semi-supervised learning framework that combines unsupervised information, captured from a similarity matrix constructed from unlabeled data, with a classifier. Our motivation is that such a similarity matrix is a powerful knowledge-discovery tool that can help classify unlabeled tweet sets. Our framework makes use of the well-known Self-training algorithm to induce a better tweet sentiment classifier. Experimental results on real-world datasets demonstrate that the proposed framework can improve the accuracy of tweet sentiment analysis.
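
The idea in the abstract, a Self-training loop whose pseudo-labels are informed by a similarity matrix built from unlabeled tweets, can be illustrated with a minimal sketch. This is not the authors' framework: the naive Bayes classifier, the cosine similarity over bag-of-words vectors, the blending weight `alpha`, the confidence `threshold`, and all function names are assumptions chosen only to make the general mechanism concrete.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector for a tweet (whitespace tokens, lowercased)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def train_nb(docs, labels, classes):
    """Multinomial naive Bayes with add-one smoothing (assumed base classifier)."""
    vocab = {w for d in docs for w in d}
    prior = {c: math.log((labels.count(c) + 1) / (len(labels) + len(classes)))
             for c in classes}
    counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        counts[y].update(d)
    return vocab, prior, counts

def predict_proba(model, doc, classes):
    vocab, prior, counts = model
    logp = {}
    for c in classes:
        total = sum(counts[c].values()) + len(vocab)
        logp[c] = prior[c] + sum(n * math.log((counts[c][w] + 1) / total)
                                 for w, n in doc.items() if w in vocab)
    m = max(logp.values())
    z = sum(math.exp(v - m) for v in logp.values())
    return {c: math.exp(logp[c] - m) / z for c in classes}

def self_train(labeled, unlabeled, classes, alpha=0.5, threshold=0.7, max_rounds=5):
    """Self-training in which classifier probabilities for each unlabeled tweet
    are blended with the similarity-weighted probabilities of its neighbours.
    alpha and threshold are hypothetical tuning parameters."""
    docs = [bow(t) for t, _ in labeled]
    labels = [y for _, y in labeled]
    pool = [bow(t) for t in unlabeled]
    # Unsupervised information: pairwise similarity of the unlabeled tweets.
    sim = [[cosine(a, b) if i != j else 0.0 for j, b in enumerate(pool)]
           for i, a in enumerate(pool)]
    remaining, assigned = list(range(len(pool))), {}
    for _ in range(max_rounds):
        if not remaining:
            break
        model = train_nb(docs, labels, classes)
        proba = {i: predict_proba(model, pool[i], classes) for i in remaining}
        refined = {}
        for i in remaining:
            weight = sum(sim[i][j] for j in remaining)
            refined[i] = {}
            for c in classes:
                neigh = (sum(sim[i][j] * proba[j][c] for j in remaining) / weight
                         if weight else proba[i][c])
                refined[i][c] = (1 - alpha) * proba[i][c] + alpha * neigh
        confident = [i for i in remaining if max(refined[i].values()) >= threshold]
        if not confident:
            break
        for i in confident:  # add confident pseudo-labels to the training set
            assigned[i] = max(refined[i], key=refined[i].get)
            docs.append(pool[i])
            labels.append(assigned[i])
        remaining = [i for i in remaining if i not in assigned]
    return train_nb(docs, labels, classes), assigned

if __name__ == "__main__":
    labeled = [("good great love", "pos"), ("bad awful hate", "neg")]
    unlabeled = ["love this great phone", "awful hate service",
                 "this phone is great", "service was awful"]
    model, assigned = self_train(labeled, unlabeled, ["pos", "neg"])
    print(assigned)
```

In this toy run, the classifier alone is less sure about "this phone is great" than about "love this great phone"; because the two tweets are similar, the blended score for the former is pulled toward its neighbour's, which is the kind of help the similarity matrix is meant to provide.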

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volumes 355–356, 10 August 2016, Pages 348–365