Article code: 6872788
Journal code: 1440624
Publication year: 2018
English article: 19-page PDF
Full-text version: free download
English title of the ISI article
Machine learning based heterogeneous web advertisements detection using a diverse feature set
Persian translation of the title
تشخیص ماشین با استفاده از مجموعه ای از ویژگی های متنوع از طریق شناسایی آگهی های وب یکپارچه می شود
Keywords
Advertisements, web accessibility, random content, random forest, machine learning
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract
Advertisement identification and filtering in web pages gain significance due to various factors such as accessibility, security, privacy, and obtrusiveness. Current practices in this direction involve maintaining URL-based regular expressions called filter lists. Each URL obtained on a web page is matched against this filter list. While effectual, this procedure lacks scalability as it demands regular continuance of the filter list. To counter these limitations, we devise a machine learning based advertisement detection system using a diverse feature set which can distinguish advertisement blocks from non-advertisement blocks. The method can act as a base to provide various accessibility-related features like smooth browsing and text summarization for persons with visual impairments, cognitive impairments, and photosensitive epilepsy. The results from a classifier trained on the proposed feature set achieve 98.6% accuracy in identifying advertisements.
Publisher
Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 89, December 2018, Pages 68-77