Detecting malicious tweets in trending topics using a statistical analysis of language

Article code	Journal code	Year of publication	English article
383951	660837	2013	9-page PDF

Keywords

Spam detection, Social network, Machine learning

Related topics

Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence

Abstract

Twitter spam detection is a recent area of research in which most previous work has focused on identifying malicious user accounts and on honeypot-based approaches. In this paper we present a methodology based on two new aspects: the detection of spam tweets in isolation, without prior information about the user; and the application of a statistical analysis of language to detect spam in trending topics. Trending topics capture the emerging Internet trends and topics of discussion that are on everybody's lips. This growing microblogging phenomenon therefore allows spammers to disseminate malicious tweets quickly and massively. We present the first work that tries to detect spam tweets in real time using language as the primary tool. We first collected and labeled a large dataset with 34 K trending topics and 20 million tweets. We then proposed a reduced set of features that are hard for spammers to manipulate. In addition, we developed a machine learning system with some orthogonal features that can be combined with other feature sets in order to analyze emergent characteristics of spam in social networks. We also conducted an extensive evaluation, which shows that our system obtains an F-measure at the same level as the best state-of-the-art systems based on the detection of spam accounts. Thus, our system can be applied to Twitter spam detection in trending topics in real time, mainly because it analyzes tweets instead of user accounts.
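
The F-measure named in the abstract is conventionally the harmonic mean of precision and recall. The abstract does not say which variant the authors report, so the sketch below assumes the balanced F1 form; the function name and the example counts are illustrative only and are not taken from the paper.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (F1) from raw counts; assumed variant, not confirmed by the abstract."""
    # Precision: fraction of tweets flagged as spam that really are spam.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: fraction of actual spam tweets that were flagged.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
score = f_measure(true_pos=8, false_pos=2, false_neg=2)
```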

► Analysis of tweets instead of user accounts, unlike most previous work.
► Detection of spam tweets in isolation, without prior information about the user.
► Novel use of language analysis to extract features that are hard for spammers to manipulate.
► New features that form an orthogonal representation of each tweet.
► The system can be applied to Twitter spam detection in trending topics in real time.
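
The abstract does not spell out what the statistical analysis of language computes. As one plausible illustration only, and not a description of the authors' method, the sketch below scores a single tweet by how far its word distribution diverges from that of other text on the same trending topic. It assumes smoothed unigram models compared with Kullback-Leibler divergence; all function names and example strings are hypothetical.

```python
import math
from collections import Counter


def unigram_model(text, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a shared vocabulary."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) in bits; p and q must be defined over the same words."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p)


def language_divergence(tweet, topic_text):
    """Divergence between a tweet and reference text for its trending topic.

    Uses only the tweet's own words, so no account history is needed.
    """
    vocab = set(tweet.lower().split()) | set(topic_text.lower().split())
    return kl_divergence(unigram_model(tweet, vocab),
                         unigram_model(topic_text, vocab))


# Hypothetical example strings.
topic = "world cup final goal match tonight"
on_topic = "watching the world cup final tonight"
off_topic = "win free iphone click here now"

on_score = language_divergence(on_topic, topic)
off_score = language_divergence(off_topic, topic)
```

Under these assumptions the off-topic tweet receives the larger divergence, and a score like this could serve as one input feature to a classifier alongside others.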

Publisher

Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 40, Issue 8, 15 June 2013, Pages 2992–3000

Authors

Juan Martinez-Romo, Lourdes Araujo
