Article code | Journal code | Publication year | English article | Full-text version
6865977 | 679603 | 2015 | 10-page PDF | Free download
English title of the ISI article
Spam filtering for short messages in adversarial environment
Persian translation of the title
Spam filtering for short messages in an adversarial environment
Keywords
Spam filtering, short message, good word attack, feature reweighting, adversarial learning
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
Unsolicited bulk messages are widespread in short message applications. Although existing spam filters perform satisfactorily, they face the challenge of an adversary who misleads them by manipulating samples. Until now, the vulnerability of spam filtering techniques for short messages has not been investigated. Unlike other spam applications, a short message contains only a few words and its length usually has an upper limit, so current adversarial learning algorithms may not work efficiently in short message spam filtering. In this paper, we investigate the existing good word attack and its counterattack method, i.e., feature reweighting, in short message spam filtering in an effort to understand whether, and to what extent, they can work efficiently when the length of a message is limited. This paper proposes a good word attack strategy that maximizes the influence on a classifier with the least number of inserted characters, based on both the weight values and the length of words. We also propose a feature reweighting method with a new rescaling function that minimizes the importance of features representing short words, so that a successful evasion requires more inserted characters. The methods are evaluated experimentally using the SMS and the comment spam datasets. The results confirm that word length is a critical factor in the robustness of short message spam filtering against the good word attack.
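The length-aware good word attack described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear classifier over binary word features (a positive score means spam), a greedy selection rule, and a cost of one separator space per inserted word. The names `spam_score`, `good_word_attack`, `max_chars`, and the toy weights are hypothetical.

```python
from typing import Dict, List, Tuple


def spam_score(words: List[str], weights: Dict[str, float], bias: float = 0.0) -> float:
    """Linear score over binary word features; a score above 0 is treated as spam."""
    return bias + sum(weights.get(w, 0.0) for w in set(words))


def good_word_attack(
    message: str,
    weights: Dict[str, float],
    max_chars: int,
    bias: float = 0.0,
) -> Tuple[str, List[str]]:
    """Greedily append 'good' (negative-weight) words until the message evades
    the filter or the character limit is reached.

    Assumption: words are ranked by weight per inserted character, where the
    cost of a word is its length plus one separator space.
    """
    words = message.split()
    present = set(words)
    # Only words that lower the score and are not already in the message help.
    candidates = [(w, wt) for w, wt in weights.items() if wt < 0 and w not in present]
    # Most negative weight per character first.
    candidates.sort(key=lambda pair: pair[1] / (len(pair[0]) + 1))

    budget = max_chars - len(message)
    inserted: List[str] = []
    for word, _ in candidates:
        if spam_score(words, weights, bias) <= 0:
            break  # already classified as legitimate
        cost = len(word) + 1
        if cost > budget:
            continue  # this word does not fit; a shorter one still might
        words.append(word)
        inserted.append(word)
        budget -= cost
    return " ".join(words), inserted


# Toy weights (hypothetical): positive values indicate spam, negative values legitimate text.
toy_weights = {"free": 2.0, "win": 1.5, "ok": -1.0, "meeting": -1.4, "tomorrow": -1.6}
attacked, added = good_word_attack("win free prize", toy_weights, max_chars=40)
print(attacked, added)
```

In this toy setting the short word "ok" is chosen first even though "tomorrow" has the larger absolute weight, because it buys more score reduction per inserted character. That is the effect a defense that down-weights short-word features, as the abstract describes, would aim to blunt.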
Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 155, 1 May 2015, Pages 167-176
Authors
, , , ,