Article code: 452381
Journal code: 694517
Publication year: 2009
English article: 14-page PDF
Full-text version: Free download
English title of the ISI article
Web robot detection: A probabilistic reasoning approach
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Networks and Communications
English abstract

In this paper, we introduce a probabilistic modeling approach to the problem of detecting Web robots from Web-server access logs. More specifically, we construct a Bayesian network that automatically classifies access-log sessions as crawler- or human-induced by combining various pieces of evidence shown to characterize crawler and human behavior. Our approach uses an adaptive-threshold technique to extract Web sessions from access logs. We then apply machine learning techniques to determine the parameters of the probabilistic model. The resulting classification is based on the maximum posterior probability over all classes given the available evidence. We apply our method to real Web-server logs and obtain results that demonstrate the robustness and effectiveness of probabilistic reasoning for crawler detection.
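
The maximum-posterior decision rule mentioned in the abstract can be illustrated with a small sketch. This is not the authors' Bayesian network: it assumes a simplified naive-Bayes structure (features conditionally independent given the class), and the feature names and all probability values are hypothetical placeholders chosen only for illustration.

```python
# Hypothetical priors and conditional probabilities (illustrative only,
# not taken from the paper).
PRIORS = {"crawler": 0.2, "human": 0.8}

# P(feature is observed | class) for two made-up binary session features.
LIKELIHOODS = {
    "crawler": {"requested_robots_txt": 0.6, "high_image_ratio": 0.1},
    "human": {"requested_robots_txt": 0.01, "high_image_ratio": 0.7},
}


def posterior(evidence):
    """Return P(class | evidence), assuming features are independent given the class."""
    scores = {}
    for cls, prior in PRIORS.items():
        score = prior
        for feature, observed in evidence.items():
            p = LIKELIHOODS[cls][feature]
            # Use p when the feature was observed, 1 - p when it was not.
            score *= p if observed else 1.0 - p
        scores[cls] = score
    total = sum(scores.values())
    # Normalise so the class scores sum to one.
    return {cls: s / total for cls, s in scores.items()}


def classify(evidence):
    """Pick the class with the maximum posterior probability."""
    post = posterior(evidence)
    return max(post, key=post.get)
```

Under these placeholder numbers, a session that fetched `robots.txt` and requested few images is labelled a crawler, while the opposite pattern is labelled human. In the approach the abstract describes, the model parameters would be learned from labelled sessions rather than set by hand.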

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Networks - Volume 53, Issue 3, 27 February 2009, Pages 265–278
Authors
, ,