Free article download: Feature selection based on a normalized difference measure for text classification

Article code	Journal code	Publication year	English article	Full-text version
4966453	1365122	2017	17-page PDF	Free download

English title of the ISI article

Feature selection based on a normalized difference measure for text classification

Persian translation of the title

انتخاب ویژگی براساس یک معیار تفاوت عادی برای طبقه بندی متن


Keywords

Text classification, Feature selection, Accuracy measure, Document frequency

Related topics

Engineering and Basic Sciences, Computer Engineering, Computer Science Software


Abstract

The goal of feature selection in text classification is to choose highly distinguishing features for improving the performance of a classifier. The well-known text classification feature selection metric named balanced accuracy measure (ACC2) (Forman, 2003) evaluates a term by taking the difference between its document frequency in the positive class (also known as true positives) and its document frequency in the negative class (also known as false positives). This, however, results in assigning equal ranks to terms having equal difference, ignoring their relative document frequencies in the classes. In this paper we propose a new feature ranking (FR) metric, called normalized difference measure (NDM), which takes into account the relative document frequencies. The performance of NDM is investigated against seven well-known feature ranking metrics, namely odds ratio (OR), chi squared (CHI), information gain (IG), distinguishing feature selector (DFS), Gini index (GINI), balanced accuracy measure (ACC2) and Poisson ratio (POIS), on seven datasets, namely WebACE (WAP, K1a, K1b), Reuters (RE0, RE1), a spam email dataset and 20 Newsgroups, using the multinomial naive Bayes (MNB) and support vector machine (SVM) classifiers. Our results show that the NDM metric outperforms the seven metrics in 66% of cases in terms of the macro-F1 measure and in 51% of cases in terms of the micro-F1 measure in our experimental trials on these datasets.
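
The contrast the abstract draws between ACC2 and NDM can be illustrated with a minimal Python sketch. The abstract does not state either formula, so the definitions below are assumptions: ACC2 is taken as the absolute difference between a term's class-normalized document frequencies (its true positive rate and false positive rate), and NDM as that same difference divided by the smaller of the two rates. The function names, the guard constant `eps`, and the document counts are hypothetical and chosen only for illustration.

```python
def rates(tp, fp, n_pos, n_neg):
    """Return (tpr, fpr): the fraction of positive and negative
    documents that contain the term."""
    return tp / n_pos, fp / n_neg


def acc2(tp, fp, n_pos, n_neg):
    """Assumed ACC2: absolute difference of the two rates."""
    tpr, fpr = rates(tp, fp, n_pos, n_neg)
    return abs(tpr - fpr)


def ndm(tp, fp, n_pos, n_neg, eps=1e-6):
    """Assumed NDM: the ACC2 difference normalized by the smaller rate.
    eps is a hypothetical guard against division by zero."""
    tpr, fpr = rates(tp, fp, n_pos, n_neg)
    return abs(tpr - fpr) / max(min(tpr, fpr), eps)


# Two made-up terms in a corpus with 100 positive and 100 negative documents.
# Both have the same rate difference, but term B is far rarer in the
# negative class relative to the positive class.
term_a = dict(tp=80, fp=60, n_pos=100, n_neg=100)
term_b = dict(tp=25, fp=5, n_pos=100, n_neg=100)

for name, term in (("A", term_a), ("B", term_b)):
    print(name, "ACC2 =", round(acc2(**term), 3), "NDM =", round(ndm(**term), 3))
```

Under these assumptions the two terms tie on ACC2, while NDM ranks term B above term A, which is the behavior the abstract describes as taking relative document frequencies into account.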

Publisher

Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 53, Issue 2, March 2017, Pages 473-489

Authors

Abdur Rehman, Kashif Javed, Haroon A. Babri
