Article code: 392635
Journal code: 665145
Publication year: 2016
English article: 21 pages, PDF
Full-text version: free download
English title of the ISI article
Distributed error estimation of functional dependency
Persian translation of the title
تخمین خطای توزیع وابستگی عملکردی
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Measuring or estimating the number of errors in (i.e., violations of) a functional dependency (FD) offers valuable information about data semantics and quality. Most existing work focuses on FD error estimation in a centralized environment, where data are stored at a single site and the goal is to optimize the time and space complexity of the estimation algorithms. The distributed FD error estimation problem, in which the data can reside at multiple physically distributed sites, has never been studied in depth and is the subject of this work. We study a version of the problem in which a coordinator site communicates with multiple remote sites to arrive at such estimates, and the goal is to minimize the communication cost. We study two types of queries that are dual to each other in semantics: one maximizes the accuracy of the FD error estimate under a fixed communication cost, and the other minimizes the communication cost needed to meet a given accuracy requirement. In our framework, each remote site maintains a concise synopsis data structure obtained by scanning its local data once, and the coordinator site receives and processes all such data structures to arrive at an estimate of the FD error. Our solution extends from the case of two remote sites to that of multiple remote sites. We demonstrate the efficacy of our proposed techniques via rigorous analysis and extensive experiments.
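
To make the setting in the abstract concrete, the following is a minimal Python sketch of the coordinator/remote-site pattern. It is not the synopsis or the estimator from the paper, which the abstract does not describe. It assumes that the FD error is the minimum number of tuples that must be removed for the dependency to hold, and it uses exact per-site counts as a stand-in for a concise synopsis. The names `local_summary` and `fd_error` and the `zip`/`city` data are hypothetical.

```python
from collections import Counter, defaultdict


def local_summary(rows, lhs, rhs):
    """Remote site: one pass over local rows, counting (lhs value, rhs value) pairs.

    Assumption: exact counts stand in for the paper's concise synopsis.
    """
    return Counter((row[lhs], row[rhs]) for row in rows)


def fd_error(summaries):
    """Coordinator: merge the per-site counts and count violating tuples.

    Assumed error definition: the minimum number of tuples to remove so that
    lhs -> rhs holds, i.e. for each lhs value keep only its most frequent rhs.
    """
    merged = Counter()
    for summary in summaries:
        merged.update(summary)

    total = defaultdict(int)        # tuples per lhs value
    most_common = defaultdict(int)  # size of the largest rhs group per lhs value
    for (x, _y), n in merged.items():
        total[x] += n
        most_common[x] = max(most_common[x], n)
    return sum(total[x] - most_common[x] for x in total)


# Hypothetical data: the FD zip -> city holds at each site on its own,
# but is violated once the two sites are combined.
site_a = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "New York"},
    {"zip": "60601", "city": "Chicago"},
]
site_b = [
    {"zip": "10001", "city": "Newark"},
    {"zip": "60601", "city": "Chicago"},
]

summaries = [local_summary(site, "zip", "city") for site in (site_a, site_b)]
error = fd_error(summaries)
```

In this example each site sees no violation locally, while the merged counts reveal one violating tuple. This is why per-site errors cannot simply be added up and some summary has to reach the coordinator. Sending exact counts costs communication proportional to the number of distinct value pairs, which is the cost a compact synopsis is meant to reduce.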

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 345, 1 June 2016, Pages 156–176