Article ID: 438166
Journal: Theoretical Computer Science
Published Year: 2014
Pages: 13 Pages
File Type: PDF
Abstract

Finding heavy elements (heavy hitters) in streaming data is one of the central and well-understood tasks. Despite the importance of this problem, in the sliding window model of streaming (where elements eventually expire) the problem of finding L2-heavy elements has remained completely open, despite multiple papers and considerable success in finding L1-heavy elements. Since the L2-heavy element problem does not satisfy certain conditions, existing methods for sliding window algorithms, such as smooth histograms or exponential histograms, are not directly applicable to it. In this paper, we develop the first polylogarithmic-memory algorithm for finding L2-heavy elements in the sliding window model. Our technique allows us to find not only L2-heavy elements, but also heavy elements with respect to any Lp with 0 < p < 2; for p > 2 this task is impossible. We demonstrate a broader applicability of our method on two additional examples: we show how to obtain a sliding window approximation of the similarity of two streams, and of the fraction of elements that appear exactly a specified number of times within the window (the α-rarity problem). In these two illustrative examples of our method, we replace the current expected memory bounds with worst-case bounds.
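
To make the problem statement concrete, the following is a minimal sketch of what "Lp-heavy in a sliding window" means. It is not the paper's algorithm: it is an exact baseline that stores the whole window (linear memory), and it assumes the common definition that an element is φ-heavy with respect to Lp when its count in the window is at least φ times the Lp norm of the window's frequency vector. The function name `lp_heavy_elements` and the parameters `window`, `phi`, and `p` are hypothetical, chosen only for illustration.

```python
from collections import Counter, deque


def lp_heavy_elements(stream, window, phi, p=2.0):
    """Exact, linear-memory baseline (illustrative only, not the paper's method).

    Returns the elements whose count among the last `window` items of
    `stream` is at least phi * ||f||_p, where f is the frequency vector
    of the window. This threshold definition is an assumption.
    """
    # A bounded deque keeps only the most recent `window` items,
    # so older items "expire" as in the sliding window model.
    recent = deque(stream, maxlen=window)
    freq = Counter(recent)
    # Lp norm of the frequency vector of the current window.
    norm = sum(count ** p for count in freq.values()) ** (1.0 / p)
    return {item for item, count in freq.items() if count >= phi * norm}


# Ten expired "z" items, then four "a" items and sixteen distinct items.
stream = ["z"] * 10 + ["a"] * 4 + [f"x{i}" for i in range(16)]

l2_heavy = lp_heavy_elements(stream, window=20, phi=0.5, p=2.0)
l1_heavy = lp_heavy_elements(stream, window=20, phi=0.5, p=1.0)
```

In this toy window, "a" occurs 4 times while the L2 norm is about 5.66 and the L1 norm is 20, so with φ = 0.5 it is L2-heavy but not L1-heavy. This illustrates why L2-heaviness is the weaker requirement and the harder one to detect in small memory.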
