Article ID | Journal | Published Year | Pages | File Type
382639 | Expert Systems with Applications | 2016 | 11 Pages | PDF
Abstract

• An improved global feature selection scheme is proposed for text classification.
• It is an ensemble method combining the power of two filter-based methods.
• The new method combines a global and a one-sided local feature selection method.
• By incorporating these methods, the feature set represents classes almost equally.
• This method outperforms the individual performances of feature selection methods.

Feature selection is a well-established remedy for the high dimensionality of the feature space, and the feature selection methods most commonly preferred for text classification are filter-based. In a common filter-based feature selection scheme, a score is assigned to each feature according to its discriminative power, and the features are sorted in descending order of score. The last step is to add the top-N features to the feature set, where N is generally determined empirically. This paper proposes an improved global feature selection scheme (IGFSS) in which that last step is modified to obtain a more representative feature set. Although a feature set constructed by a common feature selection scheme successfully represents some of the classes, a number of classes may not be represented at all. IGFSS therefore aims to improve the classification performance of global feature selection methods by creating a feature set that represents all classes almost equally. For this purpose, IGFSS uses a local feature selection method to label features according to their discriminative power on classes, and these labels are used while producing the feature sets. Experimental results on well-known benchmark datasets with various classifiers indicate that IGFSS improves classification performance in terms of two widely used metrics, Micro-F1 and Macro-F1.
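
The difference between the common top-N step and a class-balanced final step can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not give the scoring functions or the exact balancing rule, so the toy scores, the per-feature class labels, the function names `top_n` and `balanced_top_n`, and the round-robin quota are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical global scores (higher = more discriminative) and hypothetical
# class labels, standing in for the output of a local selection method.
scores = {"goal": 0.9, "match": 0.8, "team": 0.7,
          "vote": 0.4, "stock": 0.3, "bank": 0.2}
labels = {"goal": "sport", "match": "sport", "team": "sport",
          "vote": "politics", "stock": "finance", "bank": "finance"}


def top_n(scores, n):
    """Common filter scheme: sort by score (descending), keep the first n."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


def balanced_top_n(scores, labels, n):
    """Assumed balancing rule: take features class by class, in rounds,
    so that every class receives a near-equal share of the n slots."""
    by_class = defaultdict(list)
    for feat in sorted(scores, key=scores.get, reverse=True):
        by_class[labels[feat]].append(feat)  # each list stays score-ordered

    classes = sorted(by_class)
    selected, rank = [], 0
    # Round `rank` adds the rank-th best feature of each class, if any remain.
    while len(selected) < n and any(rank < len(by_class[c]) for c in classes):
        for c in classes:
            if rank < len(by_class[c]) and len(selected) < n:
                selected.append(by_class[c][rank])
        rank += 1
    return selected


plain = top_n(scores, 3)
balanced = balanced_top_n(scores, labels, 3)
print("top-N:   ", plain, {labels[f] for f in plain})
print("balanced:", balanced, {labels[f] for f in balanced})
```

With these toy values, plain top-3 selection keeps only features labeled "sport", while the balanced variate draws one feature from each of the three classes, which is the kind of under-representation the abstract describes.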

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence