کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
4960372 1364896 2017 8 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Towards a standard Part of Speech tagset for the Arabic language
ترجمه فارسی عنوان
به بخش استاندارد تگت گفتار برای زبان عربی
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر علوم کامپیوتر (عمومی)
چکیده انگلیسی

Part of Speech (PoS) tagging is still not very well investigated with respect to the Arabic language. Determining the PoS tags of a word in a particular context is difficult, primarily because there is no use of diacritics in most of contemporary texts. Consequently, the same word may be spelled in different ways. Further, detecting the difference between Arabic derivatives represents a very challenging issue for the majority of PoS taggers. Hence, the task of tagging the correct PoS tags requires advanced processing and the use of considerable resources. This study aims to design detailed hierarchical levels of the Arabic tagset categories and their relationships. These hierarchical levels allow easier expansion when required and produce more accurate and precise results. They are based on a comparative study and important references in Arabic grammar; they are also validated by experts in this field. In addition, the proposed tagset is implemented in a PoS tagger and tested via various experiments. We believe that our study makes a significant contribution to the literature because this work is an advancement in the direction of achieving a standard, rich, and comprehensive tagset for Arabic.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of King Saud University - Computer and Information Sciences - Volume 29, Issue 2, April 2017, Pages 171-178
نویسندگان
, , ,