کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
6902092 | 1446498 | 2017 | 6 صفحه PDF | دانلود رایگان |
عنوان انگلیسی مقاله ISI
Automatic minimal diacritization of Arabic texts
ترجمه فارسی عنوان
ضمایم رسمی به صورت خودکار از متون عربی
دانلود مقاله + سفارش ترجمه
دانلود مقاله ISI انگلیسی
رایگان برای ایرانیان
کلمات کلیدی
موضوعات مرتبط
مهندسی و علوم پایه
مهندسی کامپیوتر
علوم کامپیوتر (عمومی)
چکیده انگلیسی
Modern Standard Arabic (MSA) is typically written without short vowels, which helps in clarifying the sense and meaning of the word. The short vowels are omitted since experienced Arabic readers can infer the meaning through the context. But there are cases where even the native Arabic speakers cannot resolve. The process of restoring the diacritical marks (short vowels) is known as diacritization. Most of the developed algorithms for diacritization fully restores all the markings, many of which are trivial or unnecessary. In this paper, we present a system that restores the diacritical markings where it is mostly needed, resolving the ambiguity. This is a more challenging problem than fully restoring all the diacritics. The system combines morphological analyzers and context similarities. The goal of the morphological analyzers is to generate all word candidates for the diacritics, and the model eliminates word ambiguity through a statistical approach and context similarities. Out of 80 paragraphs our system resolved 57 cases.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Procedia Computer Science - Volume 117, 2017, Pages 169-174
Journal: Procedia Computer Science - Volume 117, 2017, Pages 169-174
نویسندگان
Rehab Alnefaie, Aqil M. Azmi,