Article ID: 416289
Journal: Computational Statistics & Data Analysis
Published Year: 2015
Pages: 16
File Type: PDF
Abstract

Missing data raise problems in almost all fields of quantitative research. A useful nonparametric procedure is the nearest neighbor imputation method. Improved versions of this method are presented. First, a weighted nearest neighbor imputation method based on L_q distances is proposed. It is demonstrated that the method tends to have a smaller imputation error than other nearest neighbor estimates. Then weighted nearest neighbor imputation methods that use distances for selected covariates are considered. The careful selection of distances that carry information about the missing values yields an imputation tool that can outperform competing nearest neighbor methods. This approach performs well, especially when the number of predictors is large. The methods are evaluated in simulation studies and with several real data sets from different fields.
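
The general idea of weighted nearest neighbor imputation with an L_q distance can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the weighting scheme, so the Gaussian kernel weights, the averaging of the distance over jointly observed columns, and the names `lq_distance`, `weighted_nn_impute`, `k`, `q`, and `bandwidth` are all assumptions made for illustration.

```python
import numpy as np


def lq_distance(a, b, q=2.0):
    """L_q distance between two rows, using only columns observed in both.

    Averaging over the shared columns (an assumption of this sketch) keeps
    distances comparable when rows share different numbers of columns.
    """
    shared = ~np.isnan(a) & ~np.isnan(b)
    if not shared.any():
        return np.inf  # no common information, so never a neighbor
    return np.mean(np.abs(a[shared] - b[shared]) ** q) ** (1.0 / q)


def weighted_nn_impute(X, k=3, q=2.0, bandwidth=1.0):
    """Fill each NaN with a kernel-weighted mean of its k nearest donors.

    Donors are rows that have the target column observed. The Gaussian
    kernel used for the weights is an illustrative choice only.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    for i, j in zip(*np.where(np.isnan(X))):
        donors = np.array(
            [r for r in range(X.shape[0]) if r != i and not np.isnan(X[r, j])],
            dtype=int,
        )
        dists = np.array([lq_distance(X[i], X[r], q) for r in donors])
        usable = np.isfinite(dists)
        donors, dists = donors[usable], dists[usable]
        if donors.size == 0:
            continue  # nothing to borrow from; leave the entry missing
        nearest = np.argsort(dists)[:k]
        weights = np.exp(-0.5 * (dists[nearest] / bandwidth) ** 2)
        if weights.sum() == 0.0:
            weights = np.ones_like(weights)  # fall back to a plain mean
        filled[i, j] = np.sum(weights * X[donors[nearest], j]) / weights.sum()
    return filled


# Small demonstration: the first row is missing its third value.
X_demo = np.array([
    [1.0, 2.0, np.nan],
    [1.0, 2.0, 3.0],
    [10.0, 20.0, 30.0],
])
X_filled = weighted_nn_impute(X_demo, k=2, q=2.0, bandwidth=1.0)
```

In this sketch, closer donors receive larger weights, so the imputed value is dominated by the rows most similar to the incomplete one. Restricting the columns that enter `lq_distance` would be one way to mimic the covariate selection mentioned in the abstract, though how the authors select them is not described here.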

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics