Article ID: 4961244
Journal: Procedia Computer Science
Published Year: 2017
Pages: 7
File Type: PDF
Abstract

As the Vrije Universiteit Brussel switched from an in-house-built CRIS to Pure, a large number of data quality issues were discovered. To solve these, a large-scale data quality assessment and improvement program was started. The assessment sought to find data quality issues and to prioritize cleaning tasks along different dimensions, such as reusability and complexity, while taking into account compliance and stakeholder happiness. Moreover, in doing these assessments, an attempt was made to isolate the parts of the data that are relatively easy to clean, so that they could be handled by people with less domain knowledge. Finally, some of these data quality improvement operations turned out to be straightforward enough to be fully automated.

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science (General)