کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
6888562 697420 2015 17 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Computing discounted multidimensional hierarchical aggregates using modified Misra Gries algorithm
ترجمه فارسی عنوان
با استفاده از الگوریتم اصلاح شده میسرا گریس محاسبات سلسله مراتبی چند بعدی را تخفیف داده است
کلمات کلیدی
قهرمانان سلسله مراتبی، خلاصه سازی ترافیک شبکه، گیرنده های سنگین جریان داده ها،
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر شبکه های کامپیوتری و ارتباطات
چکیده انگلیسی
Finding the “Top k” list or heavy hitters is an important function in many computing applications, including database joins, data warehousing (e.g., OLAP), web caching and hits, network usage monitoring, and detecting DDoS attacks. While most applications work on traditional “flat” data, many domains contain data that has attributes which can take values from different hierarchies e.g., time and geographic location, source and destination IP addresses, etc. In addition, hierarchical heavy hitters generated from hierarchical attributes offer insightful and a generalized view of the data. When data arrives in a stream, infrequent data elements need to be deleted due to space constraints. However, unlike traditional heavy hitters which do not consider deleted elements, in a hierarchical heavy hitter some of these deleted elements could form a heavy hitter at a higher level of the hierarchy. Furthermore, unlike traditional heavy hitters, hierarchical heavy hitters have ancestor-descendant relationship which requires the count of descendant heavy hitters to be discounted at the higher levels of the hierarchy. This is particularly challenging in a streaming environment because all the incoming data items cannot be stored or revisited. This problem is generally addressed by accepting an error constraint ϵ on the precision of the count of hierarchical heavy hitters, since accurate count cannot be guaranteed under the constrained memory requirement posed by the streaming environment. In this work, we propose a new streaming ϵ-approximation algorithm (HHH-MG) for computing Hierarchical Heavy Hitters based on a modified Misra Gries heavy hitter algorithm. The proposed algorithm guarantees ϵ-approximation precision with improved worst-case time and space bounds compared to previous algorithms. It requires overall O(ηϵ) space, and O(η) updates per element of the data, where η is a small constant. We provide theoretical proofs for the space and time requirements. We have also experimentally compared the proposed algorithm with benchmark techniques. Experimental results demonstrate that the proposed algorithm requires fewer updates per element of data, and on average requires less memory. For the experimental validation, we have used both synthetic data derived from an open source generator, and real benchmark datasets from an international Internet Service Provider.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Performance Evaluation - Volume 91, September 2015, Pages 170-186
نویسندگان
, , ,