Article code | Journal code | Publication year | English article | Full-text version
535330 | 870341 | 2014 | 7-page PDF | Free download
English title of the ISI article
Feature selection using Principal Component Analysis for massive retweet detection
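Persian translation of the title
انتخاب ویژگی با استفاده از تجزیه و تحلیل مولفه اصلی برای تشخیص عظیم پاسخ دهی (literally: "Feature selection using principal component analysis for massive reply detection")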
Keywords
Massive retweet, principal component analysis, feature selection, classification
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition
English abstract

Social networks have become a major actor in massive information propagation. The popularity of the Twitter platform is due in part to its capability of relaying messages (i.e., tweets) posted by users. This mechanism, called the retweet, allows users to massively share tweets they consider potentially interesting to others. In this paper, we study the behavior of tweets that have been massively retweeted in a short period of time. We first analyze specific tweet features through a Principal Component Analysis (PCA) to better understand the behavior of highly forwarded tweets as opposed to those retweeted only a few times. We then propose to automatically detect massively retweeted messages, using this qualitative study to select the features that allow the best classification performance. We show that selecting only the most correlated features leads to the best classification accuracy (F-measure of 65.7%), a gain of about 2.4 points over the complete set of features.
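
The abstract's pipeline (PCA on tweet features, keep the most correlated features, score a classifier by F-measure) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy features (`followers`, `mentions`, `noise`), the use of only the first principal component, and the helper names are all assumptions made for illustration, and the abstract does not say which features or classifier the paper uses.

```python
import math
import random

def standardize(rows):
    """Center each column and scale it to unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def first_component(z, iters=200):
    """Leading eigenvector and eigenvalue of the correlation matrix (power iteration)."""
    n, d = len(z), len(z[0])
    corr = [[sum(r[a] * r[b] for r in z) / n for b in range(d)] for a in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(corr[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(d)) for a in range(d))
    return v, eigenvalue

def select_features(rows, k):
    """Keep the k features most correlated with the first principal component."""
    v, eigenvalue = first_component(standardize(rows))
    # For standardized data, corr(feature j, PC1) = v[j] * sqrt(eigenvalue).
    loadings = [abs(x) * math.sqrt(eigenvalue) for x in v]
    ranked = sorted(range(len(v)), key=lambda j: loadings[j], reverse=True)
    return sorted(ranked[:k]), loadings

def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical toy data: two correlated features and one independent noise feature.
rng = random.Random(0)
rows = []
for _ in range(200):
    followers = rng.gauss(0, 1)
    mentions = followers + 0.3 * rng.gauss(0, 1)
    noise = rng.gauss(0, 1)
    rows.append([followers, mentions, noise])

selected, loadings = select_features(rows, k=2)
print("selected feature indices:", selected)
print("F-measure on a toy prediction:", f_measure([1, 1, 0, 0], [1, 0, 1, 0]))
```

In practice the selected columns would be passed to a classifier and the F-measure compared against the one obtained with the full feature set, which is the comparison the abstract reports.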

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 49, 1 November 2014, Pages 33–39