Article ID: 378794
Journal: Data & Knowledge Engineering
Published Year: 2014
Pages: 18
File Type: PDF
Abstract

A challenge for Linked Data is linking instances from different data sources that denote the same real-world object. Millions of high-quality owl:sameAs links have already been generated, but a considerable number of potential links remain undiscovered. Traditional similarity-based approaches to this data linkage problem do not scale well because they exhaustively compare every pair of instances. In this paper, we propose an automatic approach to data linkage generation for Linked Data. Specifically, a highly accurate training set is generated automatically using equivalence reasoning and common prefix blocking. The contexts of the instances in the training set are extracted and matched pairwise to learn discriminative property pairs that support linkage discovery. For a particular class pair and pay-level-domain pair, the discriminability of each property pair is measured, and a few highly discriminative property pairs are aggregated so that they can be reused to link instances between the same classes and domains. Experimental results on two OAEI tests and the BTC2011 dataset show that our approach achieves good accuracy compared with some complex methods.
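
To illustrate why blocking avoids exhaustive pairwise comparison, here is a minimal sketch of prefix-based blocking. It is not the paper's implementation: the function names, the `prefix_len` parameter, the label normalization, and the example.org URIs are all assumptions made for illustration, and the abstract does not specify how the authors build their blocking keys.

```python
from collections import defaultdict
from itertools import combinations


def prefix_blocks(instances, prefix_len=4):
    """Group instance URIs by a prefix of their normalized label.

    `instances` maps a URI to a label string. The key construction
    (lowercase, first `prefix_len` characters) is a hypothetical choice.
    """
    blocks = defaultdict(list)
    for uri, label in instances.items():
        key = label.strip().lower()[:prefix_len]
        if key:  # skip instances with an empty label
            blocks[key].append(uri)
    return blocks


def candidate_pairs(instances, prefix_len=4):
    """Return the set of URI pairs that fall into the same block."""
    pairs = set()
    for uris in prefix_blocks(instances, prefix_len).values():
        # Only instances sharing a block are ever compared.
        for a, b in combinations(sorted(uris), 2):
            pairs.add((a, b))
    return pairs


# Hypothetical toy data: five instances from two made-up sources.
instances = {
    "http://example.org/a/Berlin": "Berlin",
    "http://example.org/b/berlin_city": "berlin (city)",
    "http://example.org/a/Paris": "Paris",
    "http://example.org/b/paris_fr": "Paris, France",
    "http://example.org/b/Rome": "Rome",
}

pairs = candidate_pairs(instances)
exhaustive = len(list(combinations(instances, 2)))
print(f"{len(pairs)} candidate pairs instead of {exhaustive}")
```

On this toy input, blocking keeps only the pairs whose labels share a four-character prefix, so far fewer comparisons are needed than with an all-pairs scan. A real pipeline would still need a matching step to confirm or reject each candidate pair.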

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence