Article ID: 6338832
Journal: Atmospheric Environment
Published Year: 2015
Pages: 9
File Type: PDF
Abstract

• We propose a method for imputation of missing values in time series.
• Simulations showed adequate goodness-of-fit.
• The findings also suggest good accuracy and precision.
• We implemented the method as an open source R library.

Missing data are a major concern in epidemiological studies of the health effects of environmental air pollutants. This article presents an imputation-based method suitable for multivariate time series data, which uses the EM algorithm under the assumption of a normal distribution. Different approaches are considered for filtering the temporal component. A simulation study was performed to assess the validity and performance of the proposed method in comparison with some frequently used methods. Simulations showed that when the amount of missing data was as low as 5%, the complete data analysis yielded satisfactory results regardless of the mechanism generating the missing data, whereas validity began to deteriorate when the proportion of missing values exceeded 10%. The proposed imputation method exhibited good accuracy and precision across different settings with respect to the patterns of missing observations. Most of the imputations yielded valid results, even when data were missing not at random. The methods proposed in this study are implemented as a package called mtsdi for the statistical software system R.
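
To make the core idea concrete, the following is a minimal sketch of EM-based imputation under a multivariate normal assumption, written in Python with NumPy. It is not the mtsdi implementation and does not reproduce that package's interface: the function name `em_impute`, its parameters, and the simulated data are all hypothetical. It also omits the temporal filtering step mentioned above, so each row is treated as an independent draw. It assumes every variable has at least one observed value.

```python
import numpy as np


def em_impute(X, max_iter=200, tol=1e-6):
    """Impute NaNs in X (n rows, p variables) by EM under a multivariate normal model.

    Hypothetical helper for illustration only; no temporal component is modelled.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    missing = np.isnan(X)

    # Starting values: observed column means and the covariance of mean-filled data.
    mu = np.nanmean(X, axis=0)
    X_hat = np.where(missing, mu, X)
    sigma = np.cov(X_hat, rowvar=False, bias=True) + 1e-6 * np.eye(p)

    for _ in range(max_iter):
        # E-step: replace each missing block by its conditional mean given the
        # observed block, and accumulate the conditional covariances.
        cond_cov_sum = np.zeros((p, p))
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Nothing observed in this row: fall back to the marginal moments.
                X_hat[i] = mu
                cond_cov_sum += sigma
                continue
            s_oo = sigma[np.ix_(o, o)]
            s_mo = sigma[np.ix_(m, o)]
            coef = np.linalg.solve(s_oo, s_mo.T).T  # Sigma_mo @ inv(Sigma_oo)
            X_hat[i, m] = mu[m] + coef @ (X[i, o] - mu[o])
            cond_cov_sum[np.ix_(m, m)] += sigma[np.ix_(m, m)] - coef @ s_mo.T

        # M-step: update the mean and covariance from the completed data.
        mu_new = X_hat.mean(axis=0)
        centered = X_hat - mu_new
        sigma_new = (centered.T @ centered + cond_cov_sum) / n

        change = max(np.abs(mu_new - mu).max(), np.abs(sigma_new - sigma).max())
        mu, sigma = mu_new, sigma_new
        if change < tol:
            break

    return X_hat, mu, sigma


# Simulated example (made-up data): three correlated series with about 5% of
# the values removed completely at random.
rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.8, 0.5],
                     [0.8, 1.0, 0.6],
                     [0.5, 0.6, 1.0]])
complete = rng.multivariate_normal(np.zeros(3), true_cov, size=500)
mask = rng.random(complete.shape) < 0.05
with_gaps = np.where(mask, np.nan, complete)

imputed, mu_hat, sigma_hat = em_impute(with_gaps)
rmse = np.sqrt(np.mean((imputed[mask] - complete[mask]) ** 2))
print(f"RMSE on the masked entries: {rmse:.3f}")
```

The example removes values completely at random, which is the most benign of the missingness mechanisms the abstract refers to. Reproducing the other settings would require a mask that depends on the observed or unobserved values.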
