Article code: 535950
Journal code: 870412
Publication year: 2011
English article: 8 pages, PDF
English title of the ISI article
Feature sub-set selection metrics for Arabic text classification
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
English abstract

Feature sub-set selection (FSS) is an important step for effective text classification (TC) systems. This paper presents an empirical comparison of seventeen traditional FSS metrics for TC tasks. The TC is restricted to a support vector machine (SVM) classifier and to Arabic articles. The evaluation used a corpus of 7842 documents independently classified into ten categories. The experimental results are presented in terms of macro-averaged precision, macro-averaged recall and macro-averaged F1 measures. The results reveal that the Chi-square and Fallout FSS metrics work best for Arabic TC tasks.
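
The macro-averaged measures named in the abstract can be illustrated with a minimal sketch. It assumes the common definition in which precision, recall and F1 are computed per category and then averaged with equal weight per category; the abstract does not say whether the paper computes macro F1 this way or from the macro-averaged precision and recall, so treat that choice as an assumption. The function name `macro_scores` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (macro precision, macro recall, macro F1).

    Assumption: macro F1 is the unweighted mean of per-category F1 values.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted category gains a false positive
            fn[truth] += 1  # true category gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy example with two hypothetical categories.
macro_p, macro_r, macro_f1 = macro_scores(
    ["a", "a", "b", "b"],
    ["a", "b", "b", "b"],
)
```

Because every category contributes equally to the average, small categories influence these scores as much as large ones.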


► Show the sources of difficulty in Arabic TC.
► Show the need for feature selection.
► Comparison of seventeen traditional FSS metrics for Arabic TC tasks.
► The use of IR performance metrics as FSS metrics for Arabic TC tasks.
► Comparison of SVM, NB, kNN and Rocchio classifiers for Arabic TC tasks.
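
As a rough illustration of two of the metrics named in the abstract, the sketch below scores a single term against a single category from a 2x2 document-count table. It uses the textbook forms of the chi-square statistic and of fallout (false positives over all out-of-category documents); the exact formulas, smoothing and ranking direction used in the paper are not given on this page, so these are assumptions. The function and parameter names are hypothetical.

```python
def chi_square(a, b, c, d):
    """Textbook chi-square score of a term for a category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: term or category is empty or universal
    return n * (a * d - c * b) ** 2 / denom


def fallout(b, d):
    """Fraction of out-of-category documents that contain the term."""
    return b / (b + d) if b + d else 0.0


# A term that appears in every in-category document and nowhere else.
perfect = chi_square(10, 0, 0, 10)
# A term spread evenly across both groups carries no signal.
uninformative = chi_square(5, 5, 5, 5)
```

In a typical FSS pipeline such per-term scores are computed for every vocabulary term, and only the best-ranked terms are kept as features for the classifier.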

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 32, Issue 14, 15 October 2011, Pages 1922–1929