Article ID Journal Published Year Pages File Type
451315 Computer Networks 2010 14 Pages PDF
Abstract

Flow level information is important for many applications in network measurement and analysis. In this work, we tackle the “Top Spreaders” and “Top Scanners” problems, where hosts that are spreading the largest numbers of flows, especially small flows, must be efficiently and accurately identified. The identification of these top users can be very helpful in network management, traffic engineering, application behavior analysis, and anomaly detection.We propose novel streaming algorithms and a “Filter-Tracker-Digester” framework to catch the top spreaders and scanners online. Our framework combines sampling and streaming algorithms, as well as deterministic and randomized algorithms, in such a way that they can effectively help each other to improve accuracy while reducing memory usage and processing time. To our knowledge, we are the first to tackle the “Top Scanners” problem in a streaming way. We address several challenges, namely: traffic scale, skewness, speed, memory usage, and result accuracy. The performance bounds of our algorithms are derived analytically, and are also evaluated by both real and synthetic traces, where we show our algorithm can achieve accuracy and speed of at least an order of magnitude higher than existing approaches.

Related Topics
Physical Sciences and Engineering Computer Science Computer Networks and Communications
Authors
, , ,