Free article download: Towards filtering undesired short text messages using an online learning approach with semantic indexing

Article code	Journal code	Publication year	English article	Full-text version
4943094	1437623	2017	42-page PDF	Free download

English title of the ISI article

Towards filtering undesired short text messages using an online learning approach with semantic indexing

Persian translation of the title

به سوی فیلتر کردن پیام های کوتاه ناخواسته با استفاده از رویکرد یادگیری آنلاین با نمایه سازی معنایی

Keywords

Minimum description length; Short text messages; Semantic indexing; Text categorization; Machine learning; 00-01; 99-00

Related topics

Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence

Article preview

English abstract

The popularity and reach of short text messages in electronic communication have led spammers to use them to propagate undesired content. Such content often consists of misleading information, advertisements, viruses, and malware that can be harmful and annoying to users. The dynamic nature of spam messages demands knowledge-based systems with online learning and, therefore, most traditional text categorization techniques cannot be used. In this study, we introduce MDLText, a text classifier based on the minimum description length principle, to the context of filtering undesired short text messages. The proposed approach supports incremental learning and, therefore, its predictive model is scalable and can adapt to continuously evolving spamming techniques. It is also fast, with computational cost increasing linearly with the number of samples and features, which is very desirable for expert systems applied to real-time electronic communication. In addition to being dynamic, these messages are short and usually poorly written, rife with slang, symbols, and abbreviations that hinder text representation, learning, and filtering. In this scenario, we also investigated the benefits of using text normalization and semantic indexing techniques. We showed that these techniques can improve the quality of the text content and, consequently, enhance the performance of expert systems for spam detection. Based on these findings, we propose a new hybrid ensemble approach that combines the predictions obtained by the classifiers using the original text samples along with their variations created by applying text normalization and semantic indexing techniques. It has the advantage of being independent of the classification method, and the results indicate that it is efficient at filtering undesired short text messages.
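
The ensemble idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: MDLText is not reproduced here, and an incremental multinomial naive Bayes model stands in as the base classifier, which the abstract suggests is acceptable since the ensemble is described as independent of the classification method. The `SLANG` and `CONCEPTS` tables, the `expanded` view (a crude substitute for semantic indexing), the majority-vote rule, and all class and function names are assumptions made up for this illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical lookup tables, invented for this illustration only.
SLANG = {"u": "you", "txt": "text", "2": "to", "pls": "please"}
CONCEPTS = {"prize": "reward", "cash": "money", "win": "reward"}


def original(text):
    # View 1: the raw message, lowercased and split on whitespace.
    return text.lower().split()


def normalized(text):
    # View 2: text normalization, mapping slang to dictionary forms.
    return [SLANG.get(tok, tok) for tok in original(text)]


def expanded(text):
    # View 3: crude stand-in for semantic indexing; appends one
    # concept label for every token found in CONCEPTS.
    tokens = normalized(text)
    return tokens + [CONCEPTS[tok] for tok in tokens if tok in CONCEPTS]


class OnlineNaiveBayes:
    """Incremental multinomial naive Bayes; a stand-in, NOT MDLText."""

    def __init__(self):
        self.class_counts = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocab = set()

    def learn(self, tokens, label):
        # One update per message; no retraining on earlier messages.
        self.class_counts[label] += 1
        self.token_counts[label].update(tokens)
        self.vocab.update(tokens)

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Laplace smoothing over the vocabulary seen so far.
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)
            for tok in tokens:
                score += math.log((self.token_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class VariantEnsemble:
    """One classifier per text view, combined by majority vote."""

    def __init__(self, views=(original, normalized, expanded)):
        self.members = [(view, OnlineNaiveBayes()) for view in views]

    def learn(self, text, label):
        for view, clf in self.members:
            clf.learn(view(text), label)

    def predict(self, text):
        votes = Counter(clf.predict(view(text)) for view, clf in self.members)
        return votes.most_common(1)[0][0]


# Toy usage with made-up messages.
ensemble = VariantEnsemble()
for text, label in [
    ("win cash prize now txt 2 claim", "spam"),
    ("free prize pls call now", "spam"),
    ("are u coming 2 dinner", "ham"),
    ("see u at home later", "ham"),
]:
    ensemble.learn(text, label)

print(ensemble.predict("claim ur cash prize"))  # spam
print(ensemble.predict("u home for dinner"))    # ham
```

Each call to `learn` updates the counts for a single message, which is the property the abstract refers to as incremental learning. Using three views keeps the majority vote free of ties in the two-class case.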

Publisher

Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 83, 15 October 2017, Pages 314-325

Authors

Renato M. Silva, Tulio C. Alberto, Tiago A. Almeida, Akebo Yamakami
