کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
515229 866972 2006 22 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Automatic extraction of bilingual word pairs using inductive chain learning in various languages
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نرم افزارهای علوم کامپیوتر
پیش نمایش صفحه اول مقاله
Automatic extraction of bilingual word pairs using inductive chain learning in various languages
چکیده انگلیسی

In this paper, we propose a new learning method for extracting bilingual word pairs from parallel corpora in various languages. In cross-language information retrieval, the system must deal with various languages. Therefore, automatic extraction of bilingual word pairs from parallel corpora with various languages is important. However, previous works based on statistical methods are insufficient because of the sparse data problem. Our learning method automatically acquires rules, which are effective to solve the sparse data problem, only from parallel corpora without any prior preparation of a bilingual resource (e.g., a bilingual dictionary, a machine translation system). We call this learning method Inductive Chain Learning (ICL). Moreover, the system using ICL can extract bilingual word pairs even from bilingual sentence pairs for which the grammatical structures of the source language differ from the grammatical structures of the target language because the acquired rules have the information to cope with the different word orders of source language and target language in local parts of bilingual sentence pairs. Evaluation experiments demonstrated that the recalls of systems based on several statistical approaches were improved through the use of ICL.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Information Processing & Management - Volume 42, Issue 5, September 2006, Pages 1294–1315
نویسندگان
, , ,