Article code | Journal code | Publication year | English article | Full-text version
386335 | 660883 | 2011 | 7-page PDF | Free download
English title of the ISI article
A hybrid framework to extract bilingual multiword expression from free text
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Extracting bilingual multiword expressions is a persistent and significant problem in deriving meaning from free text, and it requires analyzing large amounts of textual information. In this paper we propose a text mining approach to extracting bilingual multiword expressions that combines statistical and rule-based methods. Extraction proceeds in two phases. In the first phase, a large number of candidates are extracted from the corpus by statistical methods; the multiple sequence alignment algorithm used here is sensitive to flexible multiword expressions. In the second phase, error-driven rules and patterns are extracted from the corpus. To acquire high-quality instances, manual work with active learning is also performed during sample selection. The trained rules are then used to filter the candidates. Bilingual comparisons are made on a parallel corpus, and some of the bilingual syntactic patterns are obtained from a bilingual phrase dictionary. Because the system has many parameters, related experiments are designed to find the best-performing settings. Experimental results show that our approach achieves good performance.
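
The two-phase shape described in the abstract (statistical candidate extraction, then rule-based filtering) can be illustrated with a minimal sketch. This is not the paper's method: pointwise mutual information over adjacent word pairs stands in for the multiple sequence alignment step, a hand-written stopword rule stands in for the trained error-driven rules, and the example is monolingual only. The function names `extract_candidates` and `filter_candidates`, the thresholds, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter

def extract_candidates(sentences, min_count=2, min_pmi=1.0):
    """Phase 1 (statistical): score adjacent word pairs by PMI.

    Assumption: PMI is a stand-in for the paper's statistical step.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scored = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # too rare to score reliably
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        pmi = math.log2(p_pair / (p_w1 * p_w2))
        if pmi >= min_pmi:
            scored[(w1, w2)] = pmi
    return scored

# Hypothetical rule set: a tiny stopword list, not the paper's learned rules.
STOPWORDS = {"the", "a", "of", "in", "to", "is"}

def filter_candidates(scored, stopwords=STOPWORDS):
    """Phase 2 (rule-based): drop candidates that begin or end with a stopword."""
    return {
        pair: score
        for pair, score in scored.items()
        if pair[0] not in stopwords and pair[-1] not in stopwords
    }

# Toy corpus, invented for this example.
corpus = [
    "the stock market fell in new york".split(),
    "new york is home to the stock market".split(),
    "the market in new york is open".split(),
]
candidates = extract_candidates(corpus)
kept = filter_candidates(candidates)
print(sorted(kept))
```

On this toy corpus the statistical phase over-generates (pairs such as "the stock" pass the PMI threshold), and the rule phase removes them, which mirrors the division of labor the abstract describes.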

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 38, Issue 1, January 2011, Pages 314–320