Article code, Journal code, Publication year, English article, Full-text version
6874502, 1441163, 2017, 11-page PDF, Free download
English title of the ISI article
Modeling a non-stationary bots' arrival process at an e-commerce Web site
Persian translation of the title
مدل سازی فرایند ورود غیرمنتظره رباتها در یک وب سایت تجارت الکترونیک
Keywords
00-01, 99-00, Web traffic analysis and modeling, Web traffic characterization, Log file analysis, Web server, Internet robot, Web bot, Modeling and simulation, Regression analysis, Heavy-tailed distribution
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract
The paper concerns the issue of modeling and generating a representative Web workload for Web server performance evaluation through simulation experiments. Web traffic analysis has been carried out for two decades, usually based on Web server log data. However, while the character of the overall Web traffic has been extensively studied and modeled, relatively few studies have been devoted to the analysis of Web traffic generated by Internet robots (Web bots). Moreover, the overwhelming majority of studies concern the traffic on non-e-commerce websites. In this paper, we address the problem of modeling a realistic arrival process of bots' requests on an e-commerce Web server. Based on real log data for an online store, sessions generated by bots were reconstructed and their key features were analyzed, including the interarrival time of bot sessions, the number of HTTP requests per session, and the interarrival time of requests within a session. To deal with the non-stationarity of the Web traffic, chunks associated with times of day were distinguished based on the intensity of bot session arrivals, and the features of sessions in individual time chunks were then analyzed separately. Using regression analysis, a mathematical model of the bots' traffic features was developed and implemented in a bot traffic generator. Our findings confirm the existence of heavy tails in the distributions of bot traffic features. The bots' session interarrival times and request interarrival times are best modeled by Weibull and sigmoid distributions, respectively, while the model proposed for the number of requests per bot session is based on a hybrid function combining one exponential and two normal distribution functions. The goodness of fit of the model was confirmed by the high correlation between the real and model data. Furthermore, a visual inspection of the simulation results showed that the estimated values represent distributions close to those of the empirical data.
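The chunk-based treatment of non-stationarity described in the abstract can be illustrated with a minimal sketch. This is not the authors' generator: the chunk boundaries, the Weibull scale and shape values, and the names `CHUNKS`, `chunk_params`, and `generate_session_arrivals` are all hypothetical, chosen only to show how session interarrival times might be drawn from a different Weibull distribution depending on the time of day.

```python
import random

# Hypothetical time-of-day chunks: (start_hour, end_hour, scale_seconds, shape).
# These values are illustrative assumptions, not parameters from the paper.
# A Weibull shape below 1 gives a heavy-tailed interarrival distribution.
CHUNKS = [
    (0, 8, 60.0, 0.70),
    (8, 18, 20.0, 0.80),
    (18, 24, 40.0, 0.75),
]


def chunk_params(t_seconds):
    """Return the (scale, shape) pair of the chunk containing time t_seconds."""
    hour = (t_seconds % 86400) / 3600.0
    for start, end, scale, shape in CHUNKS:
        if start <= hour < end:
            return scale, shape
    raise ValueError("time of day not covered by any chunk")


def generate_session_arrivals(duration_s, seed=None):
    """Generate bot session arrival times (seconds from midnight of day 0).

    Each interarrival gap is drawn from the Weibull distribution of the
    chunk in which the previous arrival fell.
    """
    rng = random.Random(seed)
    t = 0.0
    arrivals = []
    while True:
        scale, shape = chunk_params(t)
        t += rng.weibullvariate(scale, shape)  # arguments: scale, then shape
        if t >= duration_s:
            break
        arrivals.append(t)
    return arrivals


if __name__ == "__main__":
    day = generate_session_arrivals(86400, seed=1)
    print(len(day), "simulated bot sessions in one day")
```

Selecting the distribution by the chunk of the previous arrival is a simplification: a gap that crosses a chunk boundary keeps the parameters of the chunk in which it started.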
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Computational Science - Volume 22, September 2017, Pages 198-208
Authors
, ,