Article code: 483902
Journal code: 702868
Publication year: 2014
English article: 12-page PDF
Full-text version: free download
English title of the ISI article
Naïve Bayes classifiers for authorship attribution of Arabic texts
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
English abstract

Authorship attribution is the process of assigning an author to an anonymous text based on writing characteristics. Several authorship attribution methods were developed for natural languages, such as English, Chinese and Dutch. However, the number of related works for Arabic is limited. Naïve Bayes classifiers have been widely used for various natural language processing tasks. However, there is generally no mention of the event model used, which can have a considerable impact on the performance of the classifier. To the best of our knowledge, naïve Bayes classifiers have not yet been considered for authorship attribution in Arabic. Therefore, we propose to study their use for this problem, taking into account different event models, namely, simple naïve Bayes (NB), multinomial naïve Bayes (MNB), multi-variant Bernoulli naïve Bayes (MBNB) and multi-variant Poisson naïve Bayes (MPNB). We evaluate these models’ performances on a large Arabic dataset extracted from books of 10 different authors and compare them with other existing methods. The experimental results show that MBNB provides the best results and could attribute the author of a text with an accuracy of 97.43%. Comparison results with related methods indicate that MBNB and MNB are appropriate for authorship attribution.
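
The abstract names several event models without defining them. As a rough illustration only, the sketch below contrasts the textbook multinomial model (word counts) with the textbook Bernoulli model (word presence or absence), both with add-one smoothing. It is not the authors' implementation; the function names `train` and `predict`, the author labels, and the placeholder tokens `w1` to `w6` are all hypothetical.

```python
import math
from collections import Counter


def train(docs, labels, event_model="multinomial"):
    """Fit a naive Bayes model with add-one (Laplace) smoothing.

    docs: list of token lists; labels: one author name per document.
    event_model: "multinomial" (word counts) or "bernoulli" (word presence).
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    authors = sorted(set(labels))
    n_docs = Counter(labels)
    counts = {a: Counter() for a in authors}
    for doc, author in zip(docs, labels):
        # Multinomial counts every occurrence; Bernoulli counts a word once per document.
        counts[author].update(doc if event_model == "multinomial" else set(doc))

    model = {"vocab": vocab, "event_model": event_model, "prior": {}, "cond": {}}
    for a in authors:
        model["prior"][a] = math.log(n_docs[a] / len(docs))
        if event_model == "multinomial":
            total = sum(counts[a].values())
            model["cond"][a] = {w: (counts[a][w] + 1) / (total + len(vocab)) for w in vocab}
        else:
            model["cond"][a] = {w: (counts[a][w] + 1) / (n_docs[a] + 2) for w in vocab}
    return model


def predict(model, doc):
    """Return the author with the highest log posterior for a token list."""
    scores = {}
    for a, log_prior in model["prior"].items():
        cond = model["cond"][a]
        score = log_prior
        if model["event_model"] == "multinomial":
            # Each in-vocabulary token occurrence contributes one factor.
            score += sum(math.log(cond[t]) for t in doc if t in cond)
        else:
            # Every vocabulary word contributes, whether present or absent.
            present = set(doc)
            score += sum(
                math.log(cond[w] if w in present else 1 - cond[w])
                for w in model["vocab"]
            )
        scores[a] = score
    return max(scores, key=scores.get)


# Toy corpus with placeholder tokens (not data from the paper).
docs = [
    "w1 w2 w2 w3".split(),
    "w1 w2 w4".split(),
    "w5 w6 w6 w3".split(),
    "w5 w6 w4".split(),
]
labels = ["author_a", "author_a", "author_b", "author_b"]

for event_model in ("multinomial", "bernoulli"):
    model = train(docs, labels, event_model)
    print(event_model, predict(model, ["w2", "w1"]))
```

The practical difference is that the multinomial score grows with repeated words, while the Bernoulli score also penalizes an author for vocabulary words that are absent from the text. That is one reason the choice of event model can change classifier performance.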

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of King Saud University - Computer and Information Sciences - Volume 26, Issue 4, December 2014, Pages 473–484