Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
6948353 | 1451034 | 2018 | 41-page PDF | Free download |
English title of the ISI article
Assessing data quality - A probability-based metric for semantic consistency
Persian translation of the title (rendered back into English)
Evaluating data quality - A probability-based measure for semantic consistency
Keywords
Data quality, data quality assessment, data quality metric, data consistency
Related topics
Engineering and basic sciences
Computer engineering
Information systems
English abstract
We present a probability-based metric for semantic consistency using a set of uncertain rules. As opposed to existing metrics for semantic consistency, our metric can take into account rules that are expected to be fulfilled with specific probabilities. The resulting metric values represent the probability that the assessed dataset is free of internal contradictions with regard to the uncertain rules and thus have a clear interpretation. The theoretical basis for determining the metric values is statistical testing and the concept of the p-value, which allows the metric value to be interpreted as a probability. We demonstrate the practical applicability and effectiveness of the metric in a real-world setting by analyzing a customer dataset of an insurance company. Here, the metric was applied to identify semantic consistency problems in the data and to support decision-making, for instance, when offering individual products to customers.
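To make the idea in the abstract more concrete, the following is a minimal sketch of how a p-value could be computed for a single uncertain rule. It is not the authors' metric: the abstract gives no formula, so this example simply assumes a rule of the form "if antecedent, then consequent with probability p" and applies an exact two-sided binomial test. All names (`rule_consistency`, `two_sided_binomial_p_value`, the record fields) are hypothetical and chosen for illustration only.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_binomial_p_value(k, n, p):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are at most as likely
    as the observed count k (a common definition of the two-sided test).
    """
    observed = binomial_pmf(k, n, p)
    tolerance = 1e-12  # guards against floating-point ties
    total = 0.0
    for i in range(n + 1):
        pm = binomial_pmf(i, n, p)
        if pm <= observed * (1 + tolerance):
            total += pm
    return min(1.0, total)


def rule_consistency(records, antecedent, consequent, expected_probability):
    """Assumed, illustrative consistency score for one uncertain rule.

    records: iterable of dict-like rows (hypothetical structure).
    antecedent / consequent: predicates on a single record.
    expected_probability: how often the consequent is expected to hold
    among records that satisfy the antecedent.
    """
    relevant = [r for r in records if antecedent(r)]
    if not relevant:
        # No record is covered by the rule, so nothing contradicts it.
        return 1.0
    fulfilled = sum(1 for r in relevant if consequent(r))
    return two_sided_binomial_p_value(
        fulfilled, len(relevant), expected_probability
    )


# Hypothetical usage: 10 covered records, 5 of which fulfil the consequent.
sample = [{"covered": True, "fulfilled": i < 5} for i in range(10)]
score = rule_consistency(
    sample,
    antecedent=lambda r: r["covered"],
    consequent=lambda r: r["fulfilled"],
    expected_probability=0.5,
)
```

Under these assumptions, a score close to 1 indicates that the observed data agree with the stated rule probability, while a score close to 0 suggests a consistency problem worth inspecting.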
Publisher
Database: Elsevier - ScienceDirect
Journal: Decision Support Systems - Volume 110, June 2018, Pages 95-106
Authors
Bernd Heinrich, Mathias Klier, Alexander Schiller, Gerit Wagner