Automatic generation of probabilistic relationships for improving schema matching

Article code	Journal code	Publication year	English article	Full-text version
396572	670398	2011	17 pages PDF	Free download


Keywords

Semantic relationships; Word Sense Disambiguation

Related topics

Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence


Abstract

Schema matching is the problem of finding relationships among concepts across data sources that are heterogeneous in format and in structure. Starting from the “hidden meaning” associated with schema labels (i.e. class/attribute names), it is possible to discover lexical relationships among the elements of different schemata. In this work, we propose an automatic method aimed at discovering probabilistic lexical relationships in the environment of data integration “on the fly”. Our method is based on a probabilistic lexical annotation technique, which automatically associates one or more meanings with schema elements w.r.t. a thesaurus/lexical resource. However, the accuracy of automatic lexical annotation methods on real-world schemata suffers from the abundance of non-dictionary words such as compound nouns and abbreviations. We address this problem by including a method to perform schema label normalization which increases the number of comparable labels. From the annotated schemata, we derive the probabilistic lexical relationships to be collected in the Probabilistic Common Thesaurus. The method is applied within the MOMIS data integration system but can easily be generalized to other data integration systems.
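The label normalization idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it only assumes that normalization involves splitting compound labels into words and expanding abbreviations. The `ABBREVIATIONS` dictionary and the `normalize_label` function are hypothetical names introduced for illustration.

```python
import re
from typing import List

# Hypothetical abbreviation dictionary (an assumption; the resource
# actually used by the method is not described in the abstract).
ABBREVIATIONS = {"qty": "quantity", "cust": "customer", "addr": "address"}

def normalize_label(label: str) -> List[str]:
    """Split a schema label into lowercase words and expand known abbreviations."""
    # Insert a space at lower-to-upper camelCase boundaries: "custAddr" -> "cust Addr".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    # Split on whitespace, underscores, hyphens and dots; drop empty tokens.
    tokens = [t.lower() for t in re.split(r"[\s_\-.]+", spaced) if t]
    # Replace each token that is a known abbreviation with its expansion.
    return [ABBREVIATIONS.get(t, t) for t in tokens]
```

Under these assumptions, a label such as `custAddr` becomes the dictionary words `customer` and `address`, which a lexical resource can then annotate.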

Research Highlights
► Probabilistic lexical relationships among sources are discovered.
► A Probabilistic Word Sense Disambiguation algorithm annotates each schema element.
► We combine several WSD algorithms by using Dempster-Shafer's theory.
► A preprocessing step based on schema label normalization increases the number of annotatable labels.
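The highlights mention combining several WSD algorithms with Dempster-Shafer theory. The sketch below shows the standard Dempster's rule of combination for two mass functions. How the paper derives masses from each WSD algorithm is not stated here, so the `combine` function, the sense names `s1` and `s2`, and the mass values are illustrative assumptions only.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of candidate senses to a mass
    in [0, 1]; the masses of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence supports the intersection of the two sets.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint sets contradict each other; accumulate the conflict.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalize so that the remaining masses sum to 1.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

# Hypothetical outputs of two WSD algorithms for one schema label.
m_a = {frozenset({"s1"}): 0.6, frozenset({"s1", "s2"}): 0.4}
m_b = {frozenset({"s2"}): 0.5, frozenset({"s1", "s2"}): 0.5}
result = combine(m_a, m_b)
```

In this example the two sources conflict on a mass of 0.3, so the surviving masses are divided by 0.7, giving 3/7 to `s1`, 2/7 to `s2`, and 2/7 to the undecided set containing both senses.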

Publisher

Database: Elsevier - ScienceDirect
Journal: Information Systems - Volume 36, Issue 2, April 2011, Pages 192–208

Authors

Laura Po, Serena Sorrentino
