Article code: 6854396
Journal code: 1437428
Publication year: 2016
English article: 22 pages, PDF
Full-text version: free download
English title of the ISI article
Derivation of “is a” taxonomy from Wikipedia Category Graph
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
Knowledge acquisition remains one of the main obstacles to designing intelligent systems that exhibit human-level performance in complex intelligent tasks. Recent developments in crowdsourcing technologies have opened promising opportunities to overcome this problem by exploiting large amounts of machine-readable knowledge to perform tasks requiring human intelligence. Wikipedia is a case of this research trend, being the largest collaborative, multilingual source of linguistic knowledge, containing both unstructured and semi-structured information. In this paper, we propose an approach for deriving an “is a” taxonomy from the Wikipedia Categories Graph (WCG), which is an open collaborative resource. After building and filtering the WCG from a Wikipedia dump, the process mainly consists of exploiting the “BY” tag and the sharing of plural headers. These methods yield a graph formed by a set of non-connected sub-graphs. Therefore, we propose a process for linking them to finally obtain an “is a” taxonomy with only one root, modeled as a directed acyclic graph (DAG). In this work, specific DAG handling algorithms are used, including an algorithm for splitting a DAG into sub-DAGs and another for merging two DAGs. The obtained taxonomy is assessed using semantic similarity measures, which quantify the likeness between two concepts or words. To that end, we use a set of well-known benchmarks to compare the results obtained via the generated taxonomy with those achieved with WordNet, a resource created and maintained by domain experts. The experimental results revealed good correlations between computed values and human judgments. Compared to WordNet, the derived taxonomy was also found to offer broader coverage.
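The abstract describes linking non-connected sub-graphs into a single-rooted DAG and assessing the result with semantic similarity measures. The sketch below only illustrates that general idea and is not the paper's algorithm: the category names, the artificial `ROOT` node, and the function names are hypothetical, and the path-based score `1 / (1 + shortest path length)` is a simple stand-in, since the abstract does not name the measures actually used.

```python
from collections import deque

ROOT = "ROOT"  # hypothetical name for the single artificial root


def link_under_root(edges, root=ROOT):
    """Build a child -> parents map from (child, parent) "is a" pairs
    and attach every parentless node to one shared root."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)
        parents.setdefault(parent, set())
    for node, node_parents in parents.items():
        if not node_parents and node != root:
            node_parents.add(root)
    parents.setdefault(root, set())
    return parents


def is_acyclic(parents):
    """Kahn's algorithm: the graph is a DAG if every node can be removed
    once all of its parents have been removed."""
    children = {node: set() for node in parents}
    for child, node_parents in parents.items():
        for parent in node_parents:
            children[parent].add(child)
    remaining = {node: len(ps) for node, ps in parents.items()}
    queue = deque(node for node, count in remaining.items() if count == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    return removed == len(parents)


def ancestor_distances(parents, node):
    """Breadth-first search upward: shortest edge count to each ancestor."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in parents[current]:
            if parent not in dist:
                dist[parent] = dist[current] + 1
                queue.append(parent)
    return dist


def path_similarity(parents, a, b):
    """Simple path-based score (an assumption, not the paper's measure):
    1 / (1 + length of the shortest path through a common ancestor)."""
    dist_a = ancestor_distances(parents, a)
    dist_b = ancestor_distances(parents, b)
    common = dist_a.keys() & dist_b.keys()
    if not common:
        return 0.0
    shortest = min(dist_a[n] + dist_b[n] for n in common)
    return 1.0 / (1.0 + shortest)


# Two disconnected toy sub-graphs (made-up categories, not Wikipedia data).
edges = [
    ("Cats", "Felines"),
    ("Felines", "Mammals"),
    ("Dogs", "Mammals"),
    ("Oaks", "Trees"),
]
taxonomy = link_under_root(edges)
print(is_acyclic(taxonomy))
print(path_similarity(taxonomy, "Cats", "Dogs"))
print(path_similarity(taxonomy, "Cats", "Oaks"))
```

In this toy example, "Cats" and "Dogs" meet at "Mammals", whereas "Cats" and "Oaks" are related only through the artificial root, so the second pair scores lower.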
Publisher
Database: Elsevier - ScienceDirect
Journal: Engineering Applications of Artificial Intelligence - Volume 50, April 2016, Pages 265-286