A novel semantic information retrieval system based on a three-level domain model

Article code	Journal code	Publication year	English article	Full-text version
461147	696562	2013	27-page PDF	Free download

Keywords

MaxEnt; HMM; WordNet; Semantic information retrieval; Ontology; Word Sense Disambiguation

Related topics

Engineering and Basic Sciences, Computer Engineering, Computer Networks and Communications

Abstract

This paper presents a methodology and a prototype for extracting and indexing knowledge from natural language documents. The underlying domain model relies on a conceptual level (described by means of a domain ontology), which represents the domain knowledge, and a lexical level (based on WordNet), which represents the domain vocabulary. A stochastic model (the ME-2L-HMM2, which mixes – in a novel way – HMM and maximum entropy models) stores the mapping between such levels, taking into account the linguistic context of words. Not only does such a context contain the surrounding words; it also contains morphologic and syntactic information extracted using natural language processing tools. The stochastic model is then used, during the document indexing phase, to disambiguate word meanings. The semantic information retrieval engine we developed supports simple keyword-based queries, as well as natural language-based queries. The engine is also able to extend the domain knowledge, discovering new and relevant concepts to add to the domain model. The validation tests indicate that the system is able to disambiguate and extract concepts with good accuracy. A comparison between our prototype and a classic search engine shows that the proposed approach is effective in providing better accuracy.

► We present a methodology for knowledge extraction and indexing from natural language documents.
► We define a model composed of a conceptual level (ontology), a lexical level (WordNet), and a mapping level (stochastic model).
► The stochastic model mixes HMM and maximum entropy models.
► Validation tests indicate that the system is able to disambiguate words and extract concepts with good accuracy.
► A comparison between our prototype and a classic search engine shows that the proposed approach is effective in providing better precision and recall.
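
The idea of using a stochastic sequence model to map words (lexical level) to concepts (conceptual level) can be illustrated with a minimal sketch. This is not the paper's ME-2L-HMM2: a plain first-order HMM decoded with the Viterbi algorithm stands in for it, the maximum entropy component and the morphologic and syntactic context are left out, and every concept name and probability below is a hypothetical toy value chosen for illustration.

```python
from math import log

# Hypothetical toy lexicon: each word maps to candidate concepts (hidden states).
CANDIDATES = {
    "bank": ["FinancialInstitution", "RiverSide"],
    "loan": ["Credit"],
    "river": ["Watercourse"],
}

# Hypothetical transition probabilities P(concept_t | concept_{t-1}).
TRANSITION = {
    ("Credit", "FinancialInstitution"): 0.9,
    ("Credit", "RiverSide"): 0.1,
    ("Watercourse", "FinancialInstitution"): 0.1,
    ("Watercourse", "RiverSide"): 0.9,
}

# Hypothetical emission probabilities P(word | concept).
EMISSION = {
    ("FinancialInstitution", "bank"): 0.5,
    ("RiverSide", "bank"): 0.5,
    ("Credit", "loan"): 1.0,
    ("Watercourse", "river"): 1.0,
}

FLOOR = 1e-6          # assumed smoothing value for unseen events
UNKNOWN = "Unknown"   # assumed fallback concept for out-of-lexicon words


def disambiguate(words):
    """Return the most likely concept sequence for `words` (Viterbi decoding).

    A uniform prior over the first word's candidate concepts is assumed,
    so it is a constant and is omitted from the scores.
    """
    if not words:
        return []
    # best[c] = (log-probability of the best path ending in concept c, that path)
    best = {
        c: (log(EMISSION.get((c, words[0]), FLOOR)), [c])
        for c in CANDIDATES.get(words[0], [UNKNOWN])
    }
    for word in words[1:]:
        step = {}
        for c in CANDIDATES.get(word, [UNKNOWN]):
            emit = log(EMISSION.get((c, word), FLOOR))
            # Pick the predecessor concept that maximises the path score into c.
            prev, (score, path) = max(
                best.items(),
                key=lambda kv: kv[1][0] + log(TRANSITION.get((kv[0], c), FLOOR)),
            )
            trans = log(TRANSITION.get((prev, c), FLOOR))
            step[c] = (score + trans + emit, path + [c])
        best = step
    return max(best.values(), key=lambda v: v[0])[1]


print(disambiguate(["loan", "bank"]))
print(disambiguate(["river", "bank"]))
```

In this toy setting the ambiguous word "bank" resolves to a different concept depending on the preceding word, which is the kind of context-dependent disambiguation the abstract describes for the indexing phase.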

Publisher

Database: Elsevier - ScienceDirect
Journal: Journal of Systems and Software - Volume 86, Issue 5, May 2013, Pages 1426–1452

Authors

Licia Sbattella, Roberto Tedesco
