Article code: 382639
Journal code: 660775
Publication year: 2016
English article: 11 pages, PDF
Full-text version: free download
English title of the ISI article
An improved global feature selection scheme for text classification
Persian translation of the title
طرح بهبود یافته ویژگی های جهانی برای طبقه بندی متن
Keywords
Global feature selection; Filter; Text classification; Pattern recognition
Related topics
Engineering and basic sciences, Computer engineering, Artificial intelligence
English abstract


• An improved global feature selection scheme is proposed for text classification.
• It is an ensemble method combining the power of two filter-based methods.
• The new method combines a global and a one-sided local feature selection method.
• By incorporating these methods, the feature set represents classes almost equally.
• This method outperforms the individual feature selection methods.

Feature selection is a well-established remedy for the high dimensionality of the feature space, and the feature selection methods most often preferred for text classification are filter-based. In a common filter-based feature selection scheme, each feature is assigned a score according to its discriminative power, and the features are sorted in descending order of score. The last step is to add the top-N features to the feature set, where N is generally determined empirically. This paper proposes an improved global feature selection scheme (IGFSS) that modifies this last step in order to obtain a more representative feature set. Although the feature set constructed by a common feature selection scheme represents some of the classes successfully, a number of classes may not be represented at all. IGFSS therefore aims to improve the classification performance of global feature selection methods by creating a feature set that represents all classes almost equally. For this purpose, IGFSS uses a local feature selection method to label features according to their discriminative power on classes, and these labels are used while producing the feature sets. Experimental results on well-known benchmark datasets with various classifiers indicate that IGFSS improves classification performance in terms of two widely known metrics, Micro-F1 and Macro-F1.
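The contrast between the common top-N step and a class-balanced final step can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not state how features are labeled or how many are taken per class, so the labeling rule (the class with the highest local score), the equal per-class quota, the fallback fill, and all names (`top_n`, `balanced_top_n`, `global_scores`, `local_scores`) are assumptions made for illustration, and the scores are toy values.

```python
def top_n(global_scores, n):
    """Common scheme: keep the n features with the highest global score."""
    order = sorted(range(len(global_scores)), key=lambda f: -global_scores[f])
    return order[:n]


def balanced_top_n(global_scores, local_scores, n):
    """Keep about n / n_classes features per class, best global score first.

    local_scores[f][c] is a hypothetical local score of feature f for class c.
    Assumption: each feature is labeled with the class where that score peaks.
    """
    n_classes = len(local_scores[0])
    labels = [max(range(n_classes), key=lambda c: row[c]) for row in local_scores]
    order = sorted(range(len(global_scores)), key=lambda f: -global_scores[f])

    quota = n // n_classes          # assumed equal share per class
    taken = [0] * n_classes
    selected = []
    for f in order:
        if taken[labels[f]] < quota:
            selected.append(f)
            taken[labels[f]] += 1

    # Assumed fallback: fill any remaining slots by plain global rank.
    chosen = set(selected)
    for f in order:
        if len(selected) >= n:
            break
        if f not in chosen:
            selected.append(f)
            chosen.add(f)
    return selected


# Toy data: six features, two classes. Features 0-3 favor class 0,
# features 4-5 favor class 1 but have low global scores.
global_scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]
local_scores = [
    [0.9, 0.1], [0.8, 0.2], [0.7, 0.1],
    [0.6, 0.3], [0.1, 0.5], [0.2, 0.4],
]

plain = top_n(global_scores, 4)                            # [0, 1, 2, 3]
balanced = balanced_top_n(global_scores, local_scores, 4)  # [0, 1, 4, 5]
print(plain, balanced)
```

With these toy scores, plain top-N keeps only features labeled with class 0, while the balanced variant keeps two features per class, which is the kind of representation gap the abstract describes.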

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 43, January 2016, Pages 82–92
Authors
,