KerMinSVM for imbalanced datasets with a case study on arabic comics classification

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
4942811	1437419	2017	11 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Supervised classification - تحت نظارت طبقه بندی Support vector machines - ماشین بردار پشتیبانی Natural Language Processing - پردازش زبان‌های طبیعی

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی

پیش نمایش صفحه اول مقاله

KerMinSVM for imbalanced datasets with a case study on arabic comics classification

چکیده انگلیسی

Many studies have been performed to classify large-sized text documents using different classifiers, ranging from simple distance classifiers such as K-Nearest-Neighbor (KNN) to more advanced classifiers such as Support Vector Machines. Traditional approaches fail when a short text is encountered due to sparsity resulting from a limited number of words. Another common problem in text classification is class imbalance (CI). CI occurs when one class of the data contains most of the samples while the other class contains only a few. Standard classifiers, when applied to imbalanced data, result in high accuracy for the majority class and low accuracy for the minority one. We were motivated to propose a novel framework for classifying the content of Arabic comics; therefore, we propose KerMinSVM, a kernel extension of our previously proposed MinSVM coupled with a new dimensionality featuring a reduction scheme based on word root frequency ratios (WRFR). KerMinSVM was tested on multiple imbalanced benchmark datasets, and the results were verified using three measures: accuracy, F-measure, and statistical analysis. WRFR was applied to the manual construction of the Arabic comic text dataset to detect strong content in children's comic books. Test results revealed that our proposed framework outperformed most of the methods for imbalanced datasets and short text classification.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Engineering Applications of Artificial Intelligence - Volume 59, March 2017, Pages 159-169

نویسندگان

Ammar Nayal, Hadi Jomaa, Mariette Awad,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

KerMinSVM for imbalanced datasets with a case study on arabic comics classification

دسترسی سریع

ارتباط

English Website