Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem

Article ID	Journal	Published Year	Pages	File Type
839050	Nonlinear Analysis: Real World Applications	2006	28 Pages	PDF

Abstract

Most real-world data analyzed with nonlinear classification techniques are imbalanced: the classes are represented by unequal proportions of examples. Imbalanced class distributions can lead algorithms to learn overly complex models that overfit the data and have little practical relevance. This study analyzes several classification algorithms used to predict the creditworthiness of a bank's customers from checking account information. A series of experiments was conducted to compare the techniques. The objective is to determine a range of credit scores that a manager could apply in risk management. The results show that, by applying the concept of classification with equal quantities, the implicit knowledge in the data can be discovered successfully. A data cleaning strategy for handling such real cases with imbalanced data is then proposed.
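
The abstract does not say how equal class quantities were obtained. One common way to build such a training set is random undersampling, in which every class is cut down to the size of the rarest one. The sketch below illustrates that idea only and is not the authors' procedure; the function name `undersample_to_equal_classes`, the `seed` parameter, and the toy "good"/"bad" customer data are all hypothetical.

```python
import random
from collections import defaultdict


def undersample_to_equal_classes(samples, labels, seed=0):
    """Randomly drop examples so every class keeps as many as the rarest class.

    Assumption: random undersampling stands in for "classification with
    equal quantities"; the source does not specify the actual method.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # The rarest class sets the number of examples kept per class.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for label, group in by_class.items():
        for sample in rng.sample(group, target):
            balanced.append((sample, label))

    # Shuffle so the classes are not stored in contiguous runs.
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 8 "good" and 2 "bad" customers, one feature each.
toy_samples = [[float(i)] for i in range(10)]
toy_labels = ["good"] * 8 + ["bad"] * 2
balanced = undersample_to_equal_classes(toy_samples, toy_labels)
```

Undersampling discards majority-class examples, so it trades information for balance. Oversampling the minority class is the usual alternative when data are scarce.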

Keywords

Back-propagation; Data mining; Class imbalance; Credit scoring