Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
6941391 | 870256 | 2014 | 10-page PDF | Free download |
English title of the ISI article
A graph-based multi-level linguistic representation for document understanding
Persian translation of the title
یک نمایه زبانشناختی چند سطحی مبتنی بر گراف برای درک سند
Keywords
Text mining, text representation, graph-based representation
Related topics
Engineering and Basic Sciences
Computer Engineering
Computer Vision and Pattern Recognition
English abstract
Document understanding requires discovering meaningful patterns in text, which in turn requires analyzing documents and extracting information useful for a given purpose. The documents to be analyzed must be represented in some way, and different representations of the same piece of text may lead to different information extraction outcomes. It is therefore important to propose a reliable text representation schema that incorporates as many features as possible while still allowing the use of efficient document understanding algorithms. In this paper, we propose a graph-based representation of textual documents that employs different levels of formal representation of natural language. This schema takes into account different linguistic levels, such as the lexical, morphological, syntactic and semantic levels. The proposed representation schema is accompanied by a technique for extracting useful text patterns based on the idea of minimum paths in the graph. The efficiency of the proposed representation schema has been tested in one case study (Question Answering for Machine Reading Evaluation, QA4MRE), and the results of the experiments carried out are described. The results obtained show that the proposed graph-based multi-level linguistic representation schema may be successfully used in the broader framework of document understanding.
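The abstract's idea of extracting patterns from minimum paths in a text graph can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `lemma_POS` node naming, the dependency-style edge labels, the hand-built toy sentence graph, and the use of breadth-first search as an unweighted shortest-path routine. The helper names `build_graph` and `shortest_path` are hypothetical.

```python
from collections import deque

# Hypothetical toy graph for "The cat eats fresh fish".
# Nodes combine a lexical/morphological level (lemma + POS tag);
# edge labels stand in for a syntactic level. Hand-built, not parsed.
EDGES = [
    ("eat_VB", "cat_NN", "nsubj"),
    ("eat_VB", "fish_NN", "dobj"),
    ("cat_NN", "the_DT", "det"),
    ("fish_NN", "fresh_JJ", "amod"),
]


def build_graph(edges):
    """Build an undirected adjacency list: node -> [(neighbor, label)]."""
    graph = {}
    for head, dependent, label in edges:
        graph.setdefault(head, []).append((dependent, label))
        graph.setdefault(dependent, []).append((head, label))
    return graph


def shortest_path(graph, source, target):
    """Return the minimum path as alternating nodes and edge labels.

    Uses breadth-first search, so every edge counts as length 1.
    Returns None if either node is missing or no path exists.
    """
    if source not in graph or target not in graph:
        return None
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor, label in graph[node]:
            if neighbor not in parent:
                parent[neighbor] = (node, label)
                queue.append(neighbor)
    if target not in parent:
        return None
    # Walk back from target to source, then reverse.
    path = [target]
    node = target
    while parent[node] is not None:
        previous, label = parent[node]
        path.extend([label, previous])
        node = previous
    path.reverse()
    return path


graph = build_graph(EDGES)
pattern = shortest_path(graph, "cat_NN", "fresh_JJ")
print(" ".join(pattern))
```

The nodes and labels along such a path could then serve as features for comparing two texts, for example a question and a candidate answer. Which features the original work actually collects along the path is not stated in the abstract.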
Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 41, 1 May 2014, Pages 93-102
Authors
David Pinto, Helena Gómez-Adorno, Darnes Vilariño, Vivek Kumar Singh