Fast multi-label feature selection based on information-theoretic feature ranking

Article ID	Journal	Published Year	Pages	File Type
530506	Pattern Recognition	2015	11 Pages	PDF

Abstract

• A score function from mutual information between a feature and labels was derived.
• Unnecessary computations from the score function were discarded.
• A strategy to identify important labels from a sparse label set was proposed.
• The computational cost of each component was analyzed theoretically.

Multi-label feature selection involves selecting important features from multi-label data sets. This can be achieved by ranking features based on their importance and then selecting the top-ranked features. Many multi-label feature selection methods have been proposed for finding a feature subset that improves multi-label learning accuracy. In contrast, computationally efficient multi-label feature selection methods have not been studied extensively. In this study, we propose a fast multi-label feature selection method based on information-theoretic feature ranking. Experimental results demonstrate that, for large multi-label data sets, the proposed method generates a feature subset significantly faster than several other multi-label feature selection methods.
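The general rank-then-select idea described above can be illustrated with a minimal sketch. The abstract does not state the score function the paper derives, so the score used here (the sum of the mutual information between a discrete feature and each label) is only an assumption for illustration, and the names `entropy`, `mutual_information`, `rank_features`, and `select_top_k` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for two discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def rank_features(features, labels):
    """Rank feature indices by an assumed score: summed MI with every label.

    features: list of feature columns, each a list of discrete values
    labels:   list of label columns, each a list of 0/1 indicators
    """
    scores = [sum(mutual_information(f, l) for l in labels) for f in features]
    # Highest score first; ties keep the original feature order.
    order = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    return order, scores


def select_top_k(features, labels, k):
    """Return the indices of the k top-ranked features."""
    order, _ = rank_features(features, labels)
    return order[:k]
```

Because each feature is scored independently of the others, the cost of this sketch grows linearly with the number of features and with the number of labels, which is the kind of property that makes ranking-based selection attractive for large data sets. Continuous features would need to be discretized before these functions apply.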

Keywords

Entropy; Mutual information; Multi-label feature selection