On developing robust models for favourability analysis: Model choice, feature sets and imbalanced data

Article code	Journal code	Publication year	English article	Full-text version
552282	873197	2012	7-page PDF	Free download

Keywords

Sentiment analysis; Imbalanced data; Bayesian models; Machine learning

Related subjects

Engineering and Basic Sciences; Computer Engineering; Information Systems

Abstract

Locating documents carrying positive or negative favourability is an important application within media analysis. This article presents some empirical results on the challenges facing a machine-learning approach to this kind of opinion mining. Some of the challenges include the often considerable imbalance in the distribution of positive and negative samples, changes in the documents over time, and effective training and evaluation procedures for the models. This article presents results on three data sets generated by a media-analysis company, classifying documents in two ways: detecting the presence of favourability, and assessing negative vs. positive favourability. We describe our experiments in developing a machine-learning approach to automate the classification process. We explore the effect of using five different types of features, the robustness of the models when tested on data taken from a later time period, and the effect of balancing the input data by undersampling. We find varying choices for the optimum classifier, feature set and training strategy depending on the task and data set.
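The abstract mentions balancing the input data by undersampling but gives no procedure. As an illustration only, the sketch below shows one common variant, random undersampling, in which every class is cut down to the size of the smallest class. The function name `undersample`, the `seed` parameter and the toy documents are assumptions made for this example; they are not taken from the article, which may have used a different scheme.

```python
import random
from collections import defaultdict


def undersample(samples, labels, seed=0):
    """Randomly drop samples so each class keeps as many items as the smallest class.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    # The smallest class sets the per-class quota.
    smallest = min(len(group) for group in by_label.values())

    # Draw the quota from every class without replacement.
    balanced = []
    for label, group in by_label.items():
        for sample in rng.sample(group, smallest):
            balanced.append((sample, label))

    # Shuffle so the classes are interleaved in the training set.
    rng.shuffle(balanced)
    return balanced


if __name__ == "__main__":
    # Toy data: eight "positive" documents and two "negative" ones.
    docs = [f"doc{i}" for i in range(10)]
    labels = ["positive"] * 8 + ["negative"] * 2
    for doc, label in undersample(docs, labels):
        print(doc, label)
```

Undersampling of this kind would normally be applied to the training data only, so that evaluation still reflects the natural class distribution.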

Publisher

Database: Elsevier - ScienceDirect
Journal: Decision Support Systems - Volume 53, Issue 4, November 2012, Pages 712–718

Authors

Peter C.R. Lane, Daoud Clarke, Paul Hender
