Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
725089 | The Journal of China Universities of Posts and Telecommunications | 2012 | 7 Pages |
To study how swarm intelligence (SI) can integrate feature selection and instance selection for data pre-processing, and to improve the prediction accuracy of a classifier trained on the reduced data in data mining (DM), this article proposes a novel hybrid data pre-processing method based on SI and a decision tree. The method uses binary particle swarm optimization (PSO) as a subset generator that guides the particles in searching for the optimal feature subset and instance subset, and employs a decision tree as a wrapper classifier to evaluate feature subsets and instance subsets simultaneously. The PSO algorithm treats this multi-selection problem as a combinatorial optimization problem with a reasonable computational cost, and exploits the advantages of both kinds of data pre-processing to reduce the data along the feature and instance dimensions and to produce a better classifier on the reduced data. Experiments show that the method attains a lower prediction error ratio. At the same time, it achieves an average feature reduction ratio of 50.8% and an average instance reduction ratio of 36.8% on seven data sets from the University of California, Irvine (UCI), indicating the effectiveness of the method.
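The joint search described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. Each particle is a bit string whose first bits select features and whose remaining bits select instances, and a wrapper classifier scores each candidate. The following are all assumptions made for illustration: the sigmoid-based binary PSO update (the abstract does not say which variant is used), the PSO parameters, the depth-1 decision stump standing in for a full decision tree, scoring on all instances, the fitness with a small size penalty of 0.01, and the toy data. The names `split_mask`, `stump_error`, and `binary_pso` are hypothetical.

```python
import math
import random

def split_mask(bits, n_features):
    """Decode a particle: first n_features bits pick features, the rest pick instances."""
    feats = [j for j in range(n_features) if bits[j]]
    rows = [i for i, b in enumerate(bits[n_features:]) if b]
    return feats, rows

def stump_error(X, y, feats, rows):
    """Stand-in wrapper classifier (assumption): a depth-1 decision tree.

    Trained on the selected rows and features, scored on ALL rows.
    Returns an error ratio in [0, 1]; an empty selection scores 1.0.
    """
    if not feats or not rows:
        return 1.0
    labels = sorted(set(y))
    best = None  # (training errors, feature, threshold, left label, right label)
    for j in feats:
        for t in sorted({X[i][j] for i in rows}):
            left = [y[i] for i in rows if X[i][j] <= t]
            right = [y[i] for i in rows if X[i][j] > t]
            l_lab = max(labels, key=left.count)   # majority label on each side
            r_lab = max(labels, key=right.count)
            err = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
            if best is None or err < best[0]:
                best = (err, j, t, l_lab, r_lab)
    _, j, t, l_lab, r_lab = best
    wrong = sum((l_lab if x[j] <= t else r_lab) != lab for x, lab in zip(X, y))
    return wrong / len(y)

def binary_pso(fitness, n_bits, n_particles=10, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize fitness over bit strings with a sigmoid-based binary PSO."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for k in range(n_particles):
            for d in range(n_bits):
                v = (w * vel[k][d]
                     + c1 * rng.random() * (pbest[k][d] - pos[k][d])
                     + c2 * rng.random() * (gbest[d] - pos[k][d]))
                v = max(-4.0, min(4.0, v))  # clamp velocity to keep the sigmoid unsaturated
                vel[k][d] = v
                # The velocity sets the probability that the bit becomes 1.
                pos[k][d] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-v)) else 0
            f = fitness(pos[k])
            if f < pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[k][:], f
    return gbest, gbest_fit

# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.8, 2.0], [0.9, 5.5], [1.0, 1.5]]
y = [0, 0, 0, 1, 1, 1]
n_features = 2

def fitness(bits):
    """Assumed fitness: wrapper error plus a small penalty on the fraction of bits kept."""
    feats, rows = split_mask(bits, n_features)
    return stump_error(X, y, feats, rows) + 0.01 * sum(bits) / len(bits)

best_bits, best_fit = binary_pso(fitness, n_bits=n_features + len(X))
best_feats, best_rows = split_mask(best_bits, n_features)
print(best_feats, best_rows, round(best_fit, 4))
```

Encoding both selections in one particle lets a single fitness evaluation judge a feature subset and an instance subset together, which is the point of the hybrid approach. In practice the stump would be replaced by a full decision tree learner and the error estimated on held-out data.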