Article code: 458455
Journal code: 696159
Publication year: 2013
English article: 10-page PDF
Full-text version: free download
English title of the ISI article
Graph-based reference table construction to facilitate entity matching
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Networks and Communications
English abstract

Entity matching plays a crucial role in integrating information from heterogeneous data sources, and numerous solutions have been developed. Entity resolution based on a reference table is highly efficient and easy to update. In such methods, the quality of the reference table is important for effective entity matching. In this paper, we focus on constructing an effective reference table by relying on the co-occurrence relationships between tokens to identify suitable entity names. To achieve high efficiency and accuracy, we first model the data set as a graph and then cluster the vertices of the graph in two stages. Based on the connectivity between vertices, we also mine synonyms to obtain an expanded reference table. We develop an iterative system and conduct an experimental study using real data. Experimental results show that the proposed method achieves both high accuracy and high efficiency.
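
The idea of modeling a data set as a token graph can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify the edge weights or the two-stage clustering, so this example assumes whitespace tokenization, uses raw co-occurrence counts as edge weights, and substitutes a simple connected-components grouping over edges above a threshold. The function names and the `min_weight` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(records):
    """Return {(token_a, token_b): count} for tokens that share a record.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    edges = Counter()
    for record in records:
        tokens = sorted(set(record.lower().split()))
        for a, b in combinations(tokens, 2):
            edges[(a, b)] += 1
    return edges


def cluster_tokens(edges, min_weight=2):
    """Group tokens joined by edges with weight >= min_weight.

    This is plain connected components via union-find, a stand-in
    for the two-stage clustering mentioned in the abstract.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), weight in edges.items():
        if weight >= min_weight:
            parent[find(a)] = find(b)

    clusters = {}
    for token in list(parent):
        clusters.setdefault(find(token), set()).add(token)
    return sorted(sorted(c) for c in clusters.values())


records = [
    "apple ipod nano 8gb",
    "apple ipod nano silver",
    "sony walkman 8gb",
    "sony walkman black",
]
edges = build_cooccurrence_graph(records)
clusters = cluster_tokens(edges, min_weight=2)
```

With these made-up records, tokens that repeatedly appear together end up in the same group and could serve as candidate entity names, while tokens such as "8gb" that co-occur only once with any given token are left out.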


► We model the reference table generation problem as a graph with an affinity property.
► We propose hierarchical clustering in entity matching to distinguish tokens.
► We develop a graph-based method of identifying synonyms to prove the accuracy of clustering.
► We develop pruning and partition techniques to achieve high performance.
► We propose a novel method for deciding token weights.

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Systems and Software - Volume 86, Issue 6, June 2013, Pages 1679–1688