Conditional random fields versus template-matching in MT phrasing tasks involving sparse training data

Article code	Journal code	Publication year	English article	Full-text version
535282	870336	2015	9-page PDF	Free download


Keywords

Machine translation; template-matching

Related topics

Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition


Abstract

• Template matching and conditional random fields are compared on the task of phrasing.
• The study concerns machine translation applications in which training data are sparse.
• Template matching provides a more effective phrasing scheme than probabilistic models such as CRF.

This communication compares the template-matching technique with established probabilistic approaches, such as conditional random fields (CRF), on a specific linguistic task: the phrasing of a sequence of words into phrases. This task represents a low-level parsing of the sequence into linguistically motivated phrases. CRF is the established method for implementing such a data-driven parser, while template-matching is a simpler method that is faster to train and operate. The two techniques are compared here to determine the most suitable approach for extracting an accurate model.

The specific application studied is a machine translation (MT) methodology, namely PRESEMT, though the comparison also holds for other applications for which only sparse training data are available. PRESEMT uses small parallel corpora to learn structural transformations from a source language (SL) to a target language (TL) and thus translate input text. As a result, only sparse training data are available from which to train the parser. Experimental results indicate that for a limited-size training set, as is the case for the PRESEMT methodology, template-matching generates a superior phrasing model that in turn generates higher-quality translations. This is confirmed for more than one source/target language pair, using multiple independent test sets.
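The abstract does not describe how the template-matching phraser works internally, so the following is only a generic illustration of the idea, not the PRESEMT implementation. It assumes that templates are part-of-speech tag sequences observed in a small set of hand-phrased examples, and that new input is segmented greedily from left to right using the longest matching template. The function names, the tag set, the fallback label "O", and the toy training data are all hypothetical.

```python
from collections import Counter, defaultdict


def learn_templates(training_phrases):
    """Map each observed POS-tag sequence to its most frequent phrase label."""
    counts = defaultdict(Counter)
    for tags, label in training_phrases:
        counts[tuple(tags)][label] += 1
    return {tags: c.most_common(1)[0][0] for tags, c in counts.items()}


def phrase(tags, templates):
    """Greedy left-to-right segmentation, preferring the longest matching template."""
    max_len = max((len(t) for t in templates), default=1)
    result, i = [], 0
    while i < len(tags):
        # Try the longest span first, then shrink until a template matches.
        for n in range(min(max_len, len(tags) - i), 0, -1):
            candidate = tuple(tags[i:i + n])
            if candidate in templates:
                result.append((templates[candidate], list(candidate)))
                i += n
                break
        else:
            # No template covers this token: emit it as a one-token fallback phrase.
            result.append(("O", [tags[i]]))
            i += 1
    return result


# Toy, hand-made training phrases (hypothetical data, not taken from the paper).
TRAINING = [
    (["DET", "NOUN"], "NP"),
    (["DET", "ADJ", "NOUN"], "NP"),
    (["VERB"], "VP"),
    (["ADP", "DET", "NOUN"], "PP"),
]

templates = learn_templates(TRAINING)
segments = phrase(
    ["DET", "ADJ", "NOUN", "VERB", "ADP", "DET", "NOUN", "ADV"], templates
)
for label, span in segments:
    print(label, span)
```

In this sketch, "training" amounts to counting tag sequences, which is one way to see why such a scheme can be fast to train and usable with very little data, whereas a CRF has to estimate feature weights from the same small sample.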


Publisher

Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 53, 1 February 2015, Pages 44–52

Authors

George Tambouratzis
