Article ID: 392134
Journal: Information Sciences
Published Year: 2015
Pages: 26 Pages
File Type: PDF
Abstract

Preserving data quality is an important issue in the management of data collections. A crucial part of this is the detection of duplicate objects (called coreferent objects), which describe the same entity but in different ways. In this paper we present a method for detecting coreferent objects in metadata, in particular in XML schemas. Our approach compares the paths from the root element to a given element in the schema. Each path precisely defines the context and location of a specific element in the schema. Path matching is based on comparing the individual steps of which the paths are composed. The uncertainty about the matching of steps is expressed with possibilistic truth values and aggregated using the Sugeno integral. The discovered coreference of paths can help to establish a mapping between two different XML schemas. In other words, we propose a novel approach to the schema matching problem that is based solely on path comparison.
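
The aggregation idea can be illustrated with a minimal sketch. It is not the method of the paper: it replaces possibilistic truth values with a single score in [0, 1] per step, compares steps by case-insensitive equality, assumes both paths have the same number of steps, and uses a cardinality-based fuzzy measure. All function names (`path_steps`, `step_scores`, `cardinality_measure`, `sugeno_integral`) are hypothetical and chosen for illustration only. What the sketch does show accurately is the discrete Sugeno integral: sort the scores in ascending order and take the maximum, over all positions, of the minimum of the score and the measure of the set of steps scoring at least that much.

```python
from typing import Callable, FrozenSet, List, Sequence


def path_steps(path: str) -> List[str]:
    """Split a slash-separated path such as '/library/book/title' into steps."""
    return [step for step in path.split("/") if step]


def step_scores(path_a: str, path_b: str) -> List[float]:
    """Toy step comparison (an assumption): 1.0 on a case-insensitive match, else 0.0."""
    steps_a, steps_b = path_steps(path_a), path_steps(path_b)
    if len(steps_a) != len(steps_b):
        raise ValueError("this sketch only handles paths of equal length")
    return [1.0 if a.lower() == b.lower() else 0.0 for a, b in zip(steps_a, steps_b)]


def cardinality_measure(n: int) -> Callable[[FrozenSet[int]], float]:
    """Hypothetical fuzzy measure: the fraction of steps contained in the subset."""
    return lambda subset: len(subset) / n


def sugeno_integral(
    scores: Sequence[float],
    measure: Callable[[FrozenSet[int]], float],
) -> float:
    """Discrete Sugeno integral of scores with respect to a monotone fuzzy measure."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # ascending by score
    best = 0.0
    for rank, idx in enumerate(order):
        # Indices of all steps whose score is at least scores[idx].
        upper_set = frozenset(order[rank:])
        best = max(best, min(scores[idx], measure(upper_set)))
    return best


if __name__ == "__main__":
    scores = step_scores("/library/book/title", "/library/item/title")
    result = sugeno_integral(scores, cardinality_measure(len(scores)))
    print(scores, round(result, 3))  # [1.0, 0.0, 1.0] 0.667
```

With this measure, two of the three steps match fully, so the aggregated score is 2/3. A different fuzzy measure, for example one that gives more importance to steps near the target element, would change the result without changing the integral itself.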

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence