Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
4946104 | Knowledge-Based Systems | 2017 | 15 Pages |
Abstract
Incomplete datasets with missing values are unsuitable for strategic decision-making because they lead to biased results. The problem is worse when the dataset is large and collected from many heterogeneous sources. This paper addresses missing-data scenarios that have not previously been handled together. The proposed Dual Repopulated Bayesian Ant Colony Optimization (DPBACO) handles both ignorable and non-ignorable missing values in heterogeneous attributes of large datasets. DPBACO integrates Bayesian principles with the Ant Colony Optimization technique, since both are simple and efficient to implement. After the pheromone update, the solution pool is repopulated by dividing the population into two groups based on fitness values and generating new offspring through a crossover operation. The DPBACO algorithm is applied to six large mixed-attribute datasets to impute both kinds of missing values. The empirical and statistical results show that DPBACO performs better than other existing methods at missing rates ranging from 5% to 50%.
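The repopulation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the two groups are used, so the sketch assumes that the fitter half is kept, that parents are drawn from it, and that single-point crossover produces offspring to replace the weaker half. The function name `repopulate`, the list-of-values solution encoding, and the convention that higher fitness is better are all hypothetical choices made for illustration.

```python
import random


def repopulate(pool, fitness, rng=None):
    """Split the pool by fitness and refill the weaker group with offspring.

    Assumptions (not from the paper): each solution is a list of candidate
    imputed values, higher fitness is better, and single-point crossover
    between two parents from the fitter group is used.
    """
    rng = rng or random.Random()
    ranked = sorted(pool, key=fitness, reverse=True)
    half = len(ranked) // 2
    fitter, weaker = ranked[:half], ranked[half:]
    if len(fitter) < 2 or len(ranked[0]) < 2:
        return ranked  # too small to perform crossover

    offspring = []
    for _ in weaker:
        parent_a, parent_b = rng.sample(fitter, 2)
        cut = rng.randrange(1, len(parent_a))  # crossover point, never at the ends
        offspring.append(parent_a[:cut] + parent_b[cut:])
    return fitter + offspring


if __name__ == "__main__":
    pool = [[i] * 4 for i in range(6)]
    print(repopulate(pool, fitness=sum, rng=random.Random(0)))
```

In a full ant colony loop, a step like this would run once per iteration after the pheromone trails are updated, so that weak candidate imputations are replaced by recombinations of stronger ones.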
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
R. Devi Priya, R. Sivaraj, N. Sasi Priyaa