Article code: 403605
Journal code: 677280
Publication year: 2014
English article: 16 pages, PDF
Full-text version: free download
English title of the ISI article
Effect of thesaurus size on schema matching quality
Persian translation of the title
تأثیر اندازه اصطلاحنامه بر کیفیت طرح بندی
Keywords
Schema matching, thesaurus, information retrieval, searching, performance, text similarity, enterprise vocabulary
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
English abstract


• We study the effect of thesaurus size on the outcome of schema matching.
• We use a thesaurus to perform the mapping based on textual analysis.
• An enhanced search algorithm is used to apply the thesaurus to texts.
• A new vector similarity method is proposed and compared with cosine similarity.
• An increase in average similarity, with distinct values for different thesauri, was recorded.
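
The abstract describes the proposed measure as the ratio of individual shared elements to the elements in the compound set of the two vectors. The exact formula is not given here, so the following is only a minimal sketch under one possible reading: shared distinct terms divided by distinct terms in the union, which coincides with the Jaccard index. The function names and the two term lists are hypothetical, and cosine similarity over term counts is included for comparison.

```python
import math
from collections import Counter


def shared_ratio_similarity(terms_a, terms_b):
    """Shared distinct terms divided by distinct terms in the combined set.

    Assumption: this Jaccard-style reading stands in for the paper's measure,
    whose exact definition is not stated in the abstract.
    """
    set_a, set_b = set(terms_a), set(terms_b)
    combined = set_a | set_b
    if not combined:
        return 0.0
    return len(set_a & set_b) / len(combined)


def cosine_similarity(terms_a, terms_b):
    """Cosine of the angle between the term-count vectors of the two lists."""
    counts_a, counts_b = Counter(terms_a), Counter(terms_b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical thesaurus expansions of two schema element names.
customer_terms = ["client", "buyer", "purchaser", "account"]
client_terms = ["client", "customer", "buyer"]

ratio_score = shared_ratio_similarity(customer_terms, client_terms)
cosine_score = cosine_similarity(customer_terms, client_terms)
print(f"shared-element ratio: {ratio_score:.3f}")
print(f"cosine similarity:    {cosine_score:.3f}")
```

In this toy case the two lists share 2 of 5 distinct terms, so the ratio is lower than the cosine score; whether the paper's measure behaves the same way depends on its actual definition.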

Thesauri are used in many Information Retrieval (IR) applications such as data integration, data warehousing, semantic query processing and schema matching. Schema matching, or mapping, is one of the most important basic steps in data integration. It is the process of identifying semantic correspondences or equivalences between two or more schemas. Because many thesauri exist for the same knowledge domain, the quality of schema matching results, and how they change when different thesauri are used in a specific knowledge field, is not predictable. In this research, we studied the effect of thesaurus size on schema matching quality by conducting many experiments using different thesauri. In addition, a new method for calculating the similarity between vectors extracted from a thesaurus database is proposed. The method is based on the ratio of individual shared elements to the elements in the compound set of the vectors. Moreover, we explain in detail the efficient algorithm used to search the thesaurus database. After describing the experiments, we present results that show an improvement in average similarity. The completeness, effectiveness, and their harmonic mean were calculated to quantify the quality of matching. Experiments on two different thesauri show positive results, with an average Precision of 35% and a lower average Recall. The effect of thesaurus size on the quality of matching was statistically insignificant; however, other factors affecting the output, and the exact magnitude of the change, remain the focus of our future study.
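
The quality measures named in the abstract can be illustrated with a small calculation. The sketch below assumes that completeness corresponds to Recall, effectiveness to Precision, and their harmonic mean to the F-measure; that mapping is an assumption here, not something the abstract states outright. The function name and the element pairs are hypothetical and are not data from the paper.

```python
def matching_quality(found_pairs, reference_pairs):
    """Return (precision, recall, f_measure) for a set of proposed matches.

    found_pairs: correspondences proposed by a matcher.
    reference_pairs: correspondences judged correct by a human expert.
    """
    found, reference = set(found_pairs), set(reference_pairs)
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical correspondences between two small schemas.
found = {
    ("cust_name", "client_name"),
    ("cust_id", "client_no"),
    ("addr", "phone"),
    ("zip", "city"),
}
reference = {
    ("cust_name", "client_name"),
    ("cust_id", "client_no"),
    ("addr", "address"),
    ("zip", "postcode"),
    ("tel", "phone"),
}

precision, recall, f_measure = matching_quality(found, reference)
print(f"precision={precision:.2f} recall={recall:.2f} f_measure={f_measure:.2f}")
```

Here 2 of the 4 proposed pairs are correct and 2 of the 5 reference pairs are found, so Precision exceeds Recall, the same ordering the abstract reports for its averages.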

Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 71, November 2014, Pages 211–226
Authors
, , , ,