Article ID: 416504
Journal: Computational Statistics & Data Analysis
Published Year: 2012
Pages: 14 Pages
File Type: PDF
Abstract

Missing data often occur in regression analysis. Imputation, weighting, direct likelihood, and Bayesian inference are typical approaches to missing data analysis. The focus here is on missing covariate data, a common complication in the analysis of sample surveys and clinical trials. A key quantity when applying weighted estimators is the mean score contribution of observations with missing covariate(s), conditional on the observed covariates. This mean score can be estimated parametrically, or nonparametrically by its empirical average over the complete-case data when values of the observed covariates are repeated, which typically requires categorical or categorized covariates. A nonparametric kernel-based estimator is proposed for this mean score, which fully exploits the continuous nature of the covariates. The performance of the kernel-based method is compared, through simulations, with that of complete-case analysis, inverse probability weighting, doubly robust estimators and multiple imputation.
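
As a rough illustration of the idea of a kernel-based mean score, the following Python sketch fits a linear regression with one partially missing covariate. It is not the authors' estimator: the linear model, the Gaussian kernel, the fixed bandwidth, the choice to smooth over the standardized response and fully observed covariate, and all function names are assumptions made for this example. Each incomplete case borrows the score contributions of nearby complete cases through Nadaraya-Watson weights, which in this simplified setting reduces the estimating equation to a weighted least squares problem on the complete cases.

```python
import numpy as np


def gaussian_kernel_weights(targets, donors, bandwidth):
    """Nadaraya-Watson weights; row j holds the donor weights for target j."""
    diff = (targets[:, None, :] - donors[None, :, :]) / bandwidth
    k = np.exp(-0.5 * np.sum(diff ** 2, axis=2))
    return k / k.sum(axis=1, keepdims=True)


def kernel_mean_score_fit(y, x, z, observed, bandwidth=0.5):
    """Sketch of a kernel-smoothed mean score fit for y ~ 1 + x + z.

    x is only used where observed is True (hypothetical setup).
    Assumption: the mean score of an incomplete case is approximated by a
    kernel-weighted average of the scores of complete cases with similar
    (y, z), so each complete case gets weight 1 + (sum of kernel weights).
    """
    # Always-observed variables, standardized so one bandwidth fits both.
    v = np.column_stack([y, z])
    v = (v - v.mean(axis=0)) / v.std(axis=0)

    # Weights of complete cases (columns) for each incomplete case (rows).
    w = gaussian_kernel_weights(v[~observed], v[observed], bandwidth)
    case_weights = 1.0 + w.sum(axis=0)

    # Weighted least squares on the complete cases solves the
    # resulting estimating equation in closed form.
    design = np.column_stack(
        [np.ones(observed.sum()), x[observed], z[observed]]
    )
    weighted_design = design * case_weights[:, None]
    beta = np.linalg.solve(
        design.T @ weighted_design, weighted_design.T @ y[observed]
    )
    return beta, case_weights


if __name__ == "__main__":
    # Simulated data (illustrative values only): x is missing at random,
    # with missingness depending on the fully observed covariate z.
    rng = np.random.default_rng(0)
    n = 2000
    z = rng.normal(size=n)
    x = 0.5 * z + rng.normal(size=n)
    y = 1.0 + 2.0 * x - 1.0 * z + rng.normal(size=n)
    observed = rng.random(n) < 1.0 / (1.0 + np.exp(-(0.5 + z)))
    x_with_gaps = np.where(observed, x, np.nan)

    beta_hat, _ = kernel_mean_score_fit(y, x_with_gaps, z, observed)
    print("estimated coefficients:", beta_hat)
```

Because every row of kernel weights sums to one, the case weights add up to the full sample size, so the complete cases jointly stand in for all observations. In practice the bandwidth would need to be chosen with care, for example by cross-validation.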

► We give an overview of the existing methods to handle missing covariate data.
► We introduce a semi-parametric, kernel-based method to handle missing covariate data.
► We illustrate the (dis)advantages of the various methods based on simulation studies.
► The new proposal performs very well compared to most other methods.
► The new proposal does not suffer from misspecified weighting or imputation models.

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics