Article code: 6902137
Journal code: 1446498
Publication year: 2017
English article: 6-page PDF
Full-text version: free download
English title of the ISI article
Arabic Social Media Analysis and Translation
Persian translation of the title
تحلیل و ترجمه رسانه های عربی ("Analysis and Translation of Arabic Media")
Keywords
Twitter, tweet, Arabic dialect identification, machine translation
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
English abstract
Twitter is considered one of the most famous social networking platforms. It has become a very valuable information source for many Natural Language Processing (NLP) applications. Several strategies and linguistic pipelines have been developed for analyzing English tweets, but Arabic social media analysis is still an active research area. In this paper, we focus on the task of pre-processing Arabic tweets, which can be regarded as a first step for any NLP application. We follow up with statistical machine translation of Arabic tweets into English, where we explain the normalization process for both Arabic and English tweets. Moreover, to overcome the lack of Arabic-English parallel corpora in the social media context, we used the UN corpus, a more general corpus in Modern Standard Arabic and English. We then applied adaptation strategies for tweet content, such as using an out-of-domain and/or in-domain language model. Our experiments showed that applying good lexical normalization to both languages and combining in-domain and out-of-domain data for the language model improves the BLEU score by 4 points over the baseline.
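The abstract names lexical normalization of tweets as the first step but does not list the individual rules. The sketch below is only an illustration of what such a step might look like in Python. Every rule in it (dropping URLs and mentions, splitting hashtags, removing diacritics and tatweel, unifying alef variants, collapsing elongated letters) and the name `normalize_tweet` are assumptions for the sake of the example, not the authors' published pipeline.

```python
import re

# All rules below are assumed, common-practice cleanup steps;
# the abstract does not specify the authors' actual normalization rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
DIACRITICS_RE = re.compile("[\u064B-\u0652]")          # tanween, short vowels, shadda, sukun
TATWEEL = "\u0640"                                     # Arabic elongation character
ALEF_VARIANTS_RE = re.compile("[\u0622\u0623\u0625]")  # alef with madda / hamza above / hamza below
ELONGATION_RE = re.compile(r"([^\W\d_])\1{2,}")        # same letter repeated 3+ times


def normalize_tweet(text: str) -> str:
    """Apply a hypothetical lexical normalization to one tweet."""
    text = URL_RE.sub(" ", text)                 # drop links
    text = MENTION_RE.sub(" ", text)             # drop @user mentions
    text = text.replace("#", " ").replace("_", " ")  # split hashtags into words
    text = DIACRITICS_RE.sub("", text)           # strip Arabic diacritics
    text = text.replace(TATWEEL, "")             # strip tatweel
    text = ALEF_VARIANTS_RE.sub("\u0627", text)  # unify alef variants to bare alef
    text = ELONGATION_RE.sub(r"\1", text)        # heuristic: collapse elongated letters
    return " ".join(text.split())                # squeeze whitespace
```

The elongation rule is a deliberate simplification: it only touches letters, so numbers such as 1000 are left alone, but it can also over-collapse words that legitimately repeat a letter three times. The same function can be applied to the English side, where only the URL, mention, hashtag, and elongation rules have any effect.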
Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 117, 2017, Pages 298-303