Article code: 1110947
Journal code: 1488361
Publication year: 2015
English article: 8 pages, PDF
Full-text version: free download
English title of the ISI article
Creation of a High-quality, Register-diversified Parallel (English-Spanish) Corpus for Linguistic and Computational Investigations
Related subjects
Social Sciences and Humanities > Arts and Humanities > Arts and Humanities (General)
English abstract

This paper outlines ongoing work on the construction of a high-quality, richly annotated and register-diversified parallel corpus for the English-Spanish language pair, carried out within the framework of the MULTINOT project. The corpus consists of original and translated texts in both directions and is designed as a multifunctional resource for a number of disciplines, such as corpus-based contrastive linguistic and translation studies, machine translation, computer-assisted translation, computer-assisted language learning and terminology extraction. The paper describes the structure of the corpus, which includes four subcorpora: English originals (EO), Spanish originals (SO), English translations (Etrans) and Spanish translations (Strans). It also describes the registers selected for inclusion in the corpus and the methodology used to guarantee the quality of the processing steps that enrich the corpus with linguistic information at different levels.

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia - Social and Behavioral Sciences - Volume 198, 24 July 2015, Pages 249-256