Free article download: On the feasibility of character n-grams pseudo-translation for Cross-Language Information Retrieval tasks

Article code	Journal code	Publication year	English article	Full-text version
558225	1451691	2016	29-page PDF	Free download

English title of the ISI article

On the feasibility of character n-grams pseudo-translation for Cross-Language Information Retrieval tasks

Keywords

Cross-Language Information Retrieval; character n-grams; alignment algorithms for machine translation

Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing

Abstract

• We analyze the use of character n-grams both as indexing and translation units for CLIR tasks.
• We study their effective application and consistency across languages.
• We use an algorithm of our own for parallel text alignment at the subword level.
• Tests were performed for seven languages, with English as the target language.
• Results confirm their feasibility, consistency, and remarkable robustness, and show that their validity is not tied to a given implementation.

The field of Cross-Language Information Retrieval (CLIR) draws on techniques from both Machine Translation and Information Retrieval, although in a context with characteristics of its own. The present study seeks to widen our knowledge about the effectiveness and applicability to that field of non-classical translation mechanisms that work at the character n-gram level. For the purpose of this study, an n-gram based system of this type has been developed. This system requires only a bilingual machine-readable dictionary of n-grams, automatically generated from parallel corpora, which serves to translate queries previously n-grammed in the source language. n-Gramming is then used as an approximate string matching technique to perform monolingual text retrieval on the set of n-grammed documents in the target language.

The tests for this work have been performed on CLEF collections for seven European languages, taking English as the target language. After an initial tuning phase to determine the most effective way to apply the approach, the results obtained, close to the upper baseline, not only confirm the consistency across languages of this kind of character n-gram based approach, but also constitute further proof of its validity and applicability, which are not tied to a given implementation.
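
The pipeline described in the abstract (n-gram the source query, translate each n-gram through a bilingual n-gram dictionary, then match against n-grammed target documents) can be illustrated with a minimal sketch. This is not the authors' system: the function names, the choice of n = 4, the toy Spanish-to-English dictionary, and its weights are all assumptions made for illustration, and the scoring is a plain weighted overlap rather than whatever retrieval model the paper uses.

```python
from collections import Counter


def char_ngrams(text, n=4):
    """Split each word into overlapping character n-grams.

    Words no longer than n are kept whole (an assumption of this sketch).
    """
    grams = []
    for word in text.lower().split():
        if len(word) <= n:
            grams.append(word)
        else:
            grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


# Hypothetical bilingual n-gram dictionary (Spanish -> English).
# In the paper such a dictionary is generated automatically from parallel
# corpora; these entries and weights are invented for the example.
TOY_DICTIONARY = {
    "lech": [("milk", 0.9)],
    "eche": [("milk", 0.8)],
    "tost": [("toas", 0.7), ("oast", 0.6)],
    "osta": [("oast", 0.5)],
}


def pseudo_translate(query, dictionary, n=4):
    """Map source-language query n-grams to weighted target-language n-grams."""
    weights = Counter()
    for gram in char_ngrams(query, n):
        for target_gram, weight in dictionary.get(gram, []):
            weights[target_gram] += weight
    return weights


def score(document, query_weights, n=4):
    """Weighted overlap between translated query n-grams and document n-grams."""
    doc_grams = Counter(char_ngrams(document, n))
    return sum(w * doc_grams[g] for g, w in query_weights.items())


def rank(documents, query, dictionary, n=4):
    """Return the documents sorted by descending score for the query."""
    query_weights = pseudo_translate(query, dictionary, n)
    return sorted(documents, key=lambda d: score(d, query_weights, n), reverse=True)


if __name__ == "__main__":
    docs = ["train timetable to london", "milk and toast for breakfast"]
    for doc in rank(docs, "leche tostada", TOY_DICTIONARY):
        print(doc)
```

Note that no word-level translation of the query is ever produced: the output of the translation step is a bag of target-language n-grams, which is why the abstract calls it pseudo-translation and why the retrieval step is purely monolingual.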

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 36, March 2016, Pages 136–164

Authors

Jesús Vilares, Manuel Vilares, Miguel A. Alonso, Michael P. Oakes
