Article ID: 415531
Journal: Computational Statistics & Data Analysis
Published Year: 2007
Pages: 5
File Type: PDF
Abstract

Random k-nearest-neighbour (RKNN) imputation is an established algorithm for filling in missing values in data sets. Assume that the data are missing at random (MAR), so that missingness is independent of the unobserved values, and that there is a positive lower bound on the probability of a response vector being complete. Then RKNN, with k equal to the square root of the sample size, asymptotically produces independent values with the correct probability distribution for those that are missing. An experiment illustrates two different distance functions on a synthetic data set.
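
The abstract does not spell out the procedure, so the following is only a minimal sketch of one common reading of RKNN imputation, not the authors' implementation. It assumes that donors are drawn from the complete rows, that distance is Euclidean on the coordinates observed in the incomplete row (the paper compares two distance functions that are not named here), and that k is the rounded square root of the total number of rows. The function name `rknn_impute` and its parameters are hypothetical.

```python
import numpy as np


def rknn_impute(data, rng=None, k=None):
    """Fill NaNs by copying values from one randomly chosen donor
    among the k nearest complete rows (illustrative sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data, dtype=float)
    filled = data.copy()

    # Donor pool: rows with no missing entries.
    complete = data[~np.isnan(data).any(axis=1)]
    if len(complete) == 0:
        raise ValueError("need at least one complete row")

    # Assumption: k = sqrt(sample size), with sample size = all rows.
    if k is None:
        k = int(round(np.sqrt(len(data))))
    k = max(1, min(k, len(complete)))

    for i, row in enumerate(data):
        missing = np.isnan(row)
        if not missing.any():
            continue
        observed = ~missing
        # Assumption: Euclidean distance on the observed coordinates only.
        diffs = complete[:, observed] - row[observed]
        dists = np.sqrt((diffs ** 2).sum(axis=1))
        nearest = np.argsort(dists)[:k]
        # Pick one of the k nearest donors uniformly at random.
        donor = complete[rng.choice(nearest)]
        filled[i, missing] = donor[missing]
    return filled
```

Calling `rknn_impute(x, rng=np.random.default_rng(0))` on an array `x` with `NaN` entries returns a copy in which every missing entry has been replaced by the corresponding value from a randomly selected nearby complete row, and observed entries are left untouched. Because the donor is drawn at random rather than averaged, imputed values are actual observed values, which is what lets the method aim at the correct distribution rather than a smoothed one.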
