Article code: 474255
Journal code: 698856
Publication year: 2007
English article: 14 pages
Full-text version: PDF
English title of the ISI article
A heuristic approach for the continuous error localization problem in data cleaning
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
English abstract

The Error Localization Problem consists of finding the minimum number of fields in a record such that, by modifying the values in these fields, the new record satisfies a given set of rules. This problem is of great interest to statistical agencies insofar as cleaning microdata is concerned. It has been shown to be NP-hard, and exact methods in the literature succeed only in solving small instances. This article presents a new heuristic algorithm based on a descending search approach to obtain near-optimal solutions. Some procedures of this descending search make use of Farkas’ Lemma in Linear Programming to drastically reduce the search space in one of the proposed neighborhoods. Computational experience on randomly generated instances shows that the approach can deal with instances of up to 1000 fields and 400 edits.
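
To make the problem statement concrete, the following is a minimal sketch of an exhaustive search on a toy instance. It is not the heuristic of the article: it enumerates field subsets by increasing size, so it only works for a handful of fields. It assumes the edits are linear inequalities of the form A x <= b, and it checks feasibility with Fourier-Motzkin elimination instead of an LP solver. The names `feasible` and `localize` and the three-field example (turnover, costs, profit) are hypothetical and do not come from the article.

```python
from fractions import Fraction
from itertools import combinations


def feasible(rows, nvars):
    """Check whether the system sum(coeffs[j] * x[j]) <= rhs has a solution.

    rows is a list of (coeffs, rhs) pairs. Variables are eliminated one by
    one with Fourier-Motzkin elimination, using exact rational arithmetic.
    """
    rows = [([Fraction(c) for c in coeffs], Fraction(rhs)) for coeffs, rhs in rows]
    for k in range(nvars):
        pos = [r for r in rows if r[0][k] > 0]
        neg = [r for r in rows if r[0][k] < 0]
        new_rows = [r for r in rows if r[0][k] == 0]
        for pc, pr in pos:
            for nc, nr in neg:
                # Scale both rows so the x[k] coefficients are +1 and -1,
                # then add them so that x[k] cancels.
                a, b = pc[k], -nc[k]
                coeffs = [p / a + n / b for p, n in zip(pc, nc)]
                new_rows.append((coeffs, pr / a + nr / b))
        rows = new_rows
    # Every remaining row reads 0 <= rhs.
    return all(rhs >= 0 for _, rhs in rows)


def localize(A, b, record):
    """Return a smallest tuple of field indices whose values must change so
    that the record can satisfy A x <= b, or None if no record can."""
    n = len(record)
    for size in range(n + 1):
        for S in combinations(range(n), size):
            rows = []
            for coeffs, rhs in zip(A, b):
                # Fields outside S keep their recorded value.
                fixed = sum(coeffs[j] * record[j] for j in range(n) if j not in S)
                rows.append(([coeffs[j] for j in S], rhs - fixed))
            if feasible(rows, len(S)):
                return S
    return None


# Hypothetical edits on fields (T, C, P):
#   T - C - P = 0 (two inequalities), T >= 0, C >= 0, 2P <= T
A = [[1, -1, -1], [-1, 1, 1], [-1, 0, 0], [0, -1, 0], [-1, 0, 2]]
b = [0, 0, 0, 0, 0]

print(localize(A, b, [100, 60, 50]))
print(localize(A, b, [100, 60, 70]))
```

For the record (100, 60, 50) the balance edit fails and changing the first field alone (to 110) repairs it. For (100, 60, 70) only the third field can be changed alone, because the edit 2P <= T rules out the other single-field repairs. The number of subsets grows exponentially with the number of fields, which is why exhaustive enumeration does not scale to instances of the size reported in the abstract.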

Publisher
Database: Elsevier - ScienceDirect
Journal: Computers & Operations Research - Volume 34, Issue 8, August 2007, Pages 2370–2383