Article ID: 8902311
Journal: Journal of Computational and Applied Mathematics
Published Year: 2018
Pages: 25 Pages
File Type: PDF
Abstract
The financial crisis of 2007 exposed the banking sector to massive credit risk, and credit scoring has attracted growing attention as a result. Banks hold large amounts of customer data, and a credit scoring model can use these data to judge an applicant's credit risk accurately. However, such data are often high-dimensional and contain irrelevant features, which degrade classifier accuracy. Feature selection is therefore an important topic. This paper proposes HMPGA, a two-phase hybrid approach based on filter approaches and a multiple population genetic algorithm (MPGA). In phase 1, it introduces the idea of the wrapper approach into three filter approaches to acquire important prior information for setting the initial populations of MPGA. In phase 2, it exploits MPGA's global optimization and quick convergence to find an optimal feature subset. The paper compares HMPGA, MPGA and GA on two real credit scoring datasets from the UCI databases. The results verify that the feature subsets acquired by HMPGA, MPGA and GA yield higher accuracies than those of the three filter approaches. In addition, a nonparametric Wilcoxon signed-rank test confirms that HMPGA is better than MPGA and GA. HMPGA can be applied not only to feature selection for credit scoring but also to other fields of data mining.
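The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the operators, parameters, migration scheme, or fitness function, so everything here is an assumption chosen for illustration: the function names (`seeded_population`, `evolve`, `migrate`, `hmpga_sketch`), the seeding probabilities, the ring migration, and the toy fitness that stands in for classifier accuracy. The three filter rankings are also hypothetical inputs, and the wrapper-style refinement of phase 1 is not modelled.

```python
import random

def seeded_population(ranking, n_features, size, rng, top_k):
    """Phase 1 (assumed form): bias individuals toward a filter's top_k features."""
    top = set(ranking[:top_k])
    # Top-ranked features are switched on with probability 0.8, the rest with 0.2.
    return [[1 if rng.random() < (0.8 if j in top else 0.2) else 0
             for j in range(n_features)] for _ in range(size)]

def evolve(pop, fitness, rng, p_cross, p_mut):
    """One generation: elitism, binary tournament, one-point crossover, bit flips."""
    children = [max(pop, key=fitness)[:]]  # keep the current best unchanged
    while len(children) < len(pop):
        a = max(rng.sample(pop, 2), key=fitness)
        b = max(rng.sample(pop, 2), key=fitness)
        child = a[:]
        if rng.random() < p_cross:
            cut = rng.randrange(1, len(a))
            child = a[:cut] + b[cut:]
        children.append([bit ^ 1 if rng.random() < p_mut else bit for bit in child])
    return children

def migrate(pops, fitness):
    """Ring migration: the best of one population replaces the worst of the next."""
    bests = [max(p, key=fitness)[:] for p in pops]
    for i, p in enumerate(pops):
        worst = min(range(len(p)), key=lambda k: fitness(p[k]))
        p[worst] = bests[i - 1]

def hmpga_sketch(rankings, n_features, fitness, generations=30, size=20, seed=0):
    """Phase 2 (assumed form): one population per filter ranking, evolved together."""
    rng = random.Random(seed)
    top_k = max(1, n_features // 3)
    pops = [seeded_population(r, n_features, size, rng, top_k) for r in rankings]
    # Each population gets its own crossover and mutation rates (illustrative values).
    params = [(0.7 + 0.1 * i, 0.01 + 0.02 * i) for i in range(len(pops))]
    for _ in range(generations):
        pops = [evolve(p, fitness, rng, pc, pm) for p, (pc, pm) in zip(pops, params)]
        migrate(pops, fitness)
    return max((ind for p in pops for ind in p), key=fitness)

# Toy stand-in for classifier accuracy: reward relevant features, penalize extras.
RELEVANT = {0, 2, 5}

def toy_fitness(mask):
    hits = sum(mask[j] for j in RELEVANT)
    extras = sum(mask) - hits
    return hits - 0.5 * extras

# Three hypothetical filter rankings over 8 features (most important first).
rankings = [[0, 2, 1, 5, 3, 4, 6, 7],
            [2, 5, 0, 7, 1, 3, 4, 6],
            [5, 0, 3, 2, 6, 1, 4, 7]]
best = hmpga_sketch(rankings, 8, toy_fitness)
```

In a real credit scoring setting, `toy_fitness` would be replaced by the cross-validated accuracy of a classifier trained on the selected columns, and the rankings would come from actual filter scores.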
Related Topics
Physical Sciences and Engineering > Mathematics > Applied Mathematics