کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
420733 683972 2009 22 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Order statistics and estimating cardinalities of massive data sets
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
Order statistics and estimating cardinalities of massive data sets
چکیده انگلیسی

A new class of algorithms to estimate the cardinality of very large multisets using constant memory and doing only one pass on the data is introduced here. It is based on order statistics rather than on bit patterns in binary representations of numbers. Three families of estimators are analyzed. They attain a standard error of 1M using MM units of storage, which places them in the same class as the best known algorithms so far. The algorithms have a very simple internal loop, which gives them an advantage in terms of processing speed. For instance, a memory of only 12 kB and only few seconds are sufficient to process a multiset with several million elements and to build an estimate with accuracy of order 2 percent. The algorithms are validated both by mathematical analysis and by experimentations on real internet traffic.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Discrete Applied Mathematics - Volume 157, Issue 2, 28 January 2009, Pages 406–427
نویسندگان
,