دانلود رایگان مقاله: نمونه برداری کارآمد از جریان داده های توریستی غیر دقیق

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
437687	690174	2015	12 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

Efficient sampling of non-strict turnstile data streams

ترجمه فارسی عنوان

نمونه برداری کارآمد از جریان داده های توریستی غیر دقیق

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

نمونه برداری، توزیع معکوس، جریان داده ها

Data streams - جریان داده ها Sampling - نمونه‌برداری

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات

پیش نمایش مقاله

نمونه برداری کارآمد از جریان داده های توریستی غیر دقیق

چکیده انگلیسی

We study the problem of generating a large sample from a data stream SS of elements (i,v)(i,v), where i is a positive integer key, v is an integer equal to the count of key i , and the sample consists of pairs (i,Ci)(i,Ci) for Ci=∑(i,v)∈SvCi=∑(i,v)∈Sv. We consider strict turnstile streams and general non-strict turnstile streams, in which CiCi may be negative. Our sample is useful for approximating both forward and inverse distribution statistics, within an additive error ϵ and provable success probability 1−δ1−δ.Our sampling method improves by an order of magnitude the known processing time of each stream element, a crucial factor in data stream applications, thereby providing a feasible solution to the sampling problem. For example, for a sample of size O(ϵ−2log⁡(1/δ))O(ϵ−2log⁡(1/δ)) in non-strict streams, our solution requires O((log⁡log⁡(1/ϵ))2+(log⁡log⁡(1/δ))2)O((log⁡log⁡(1/ϵ))2+(log⁡log⁡(1/δ))2) operations per stream element, whereas the best previous solution requires O(ϵ−2log2⁡(1/δ))O(ϵ−2log2⁡(1/δ)) evaluations of a fully independent hash function per element.We achieve this improvement by constructing an efficient K-elements recovery structure from which K elements can be extracted with probability 1−δ1−δ. Our structure enables our sampling algorithm to run on distributed systems and extract statistics on the difference between streams.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Theoretical Computer Science - Volume 590, 26 July 2015, Pages 106–117

نویسندگان

Neta Barkay, Ely Porat, Bar Shalem,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

دانلود رایگان مقاله ISI : نمونه برداری کارآمد از جریان داده های توریستی غیر دقیق

دسترسی سریع

ارتباط

English Website