Article ID: 416255
Journal: Computational Statistics & Data Analysis
Published Year: 2016
Pages: 8
File Type: PDF
Abstract

This study examines measures of predictive power for generalized linear models (GLMs). Although many such measures have been proposed, most have limitations. We therefore focus on the regression correlation coefficient (RCC) (Zheng and Agresti, 2000), which satisfies four requirements: (i) interpretability, (ii) applicability, (iii) consistency, and (iv) affinity. The RCC is a population value defined as the correlation between the response variable and its conditional expectation given the predictors. Its sample value is the sample correlation between the observed responses and the fitted values. For an arbitrary GLM, the RCC does not always have an explicit form. For a Poisson regression model, however, if the predictor variables follow a multivariate normal distribution, the RCC (true value) can be written explicitly. This makes it possible to compare estimators (sample values) of the RCC in terms of bias and root mean square error (RMSE) against the true value. Using the explicit form, we also propose a new estimator of the RCC for the Poisson regression model. We then compare the new estimator with the sample correlation estimator, the jackknife estimator, and the leave-one-out cross-validation estimator in terms of bias and RMSE. The leave-one-out cross-validation estimator has a large negative bias and a large RMSE. The remaining three estimators behave similarly for large sample sizes, but for small sample sizes the new estimator performs best in terms of bias and RMSE.
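
The sample correlation estimator described above, and the kind of closed-form true value that a log-link Poisson model with multivariate normal predictors admits, can be sketched as follows. This is an illustrative sketch, not the paper's code: the function names (`fit_poisson_irls`, `sample_rcc`, `true_rcc`) and the simulation settings are assumptions. The closed form is derived here from the lognormal moments of exp(eta) with a canonical log link assumed, and it may be presented differently in the paper. The paper's proposed new estimator is not reproduced.

```python
import numpy as np

def fit_poisson_irls(X, y, n_iter=50, tol=1e-10):
    """Fit a log-link Poisson regression by iteratively reweighted least squares."""
    Xd = np.column_stack([np.ones(len(y)), X])   # design matrix with intercept
    beta = np.zeros(Xd.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)            # start at the intercept-only fit
    for _ in range(n_iter):
        eta = Xd @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu                  # working response
        XtW = Xd.T * mu                          # weights are mu for the log link
        beta_new = np.linalg.solve(XtW @ Xd, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

def sample_rcc(X, y):
    """Sample RCC: correlation between observed y and fitted means."""
    beta_hat = fit_poisson_irls(X, y)
    mu_hat = np.exp(beta_hat[0] + X @ beta_hat[1:])
    return np.corrcoef(y, mu_hat)[0, 1]

def true_rcc(beta0, beta, mean, cov):
    """Population RCC when X ~ N(mean, cov) and Y | X ~ Poisson(exp(beta0 + beta'X)).

    Since cov(Y, E[Y|X]) = Var(E[Y|X]), the RCC equals
    sqrt(Var(E[Y|X]) / Var(Y)), with Var(Y) = E[Y] + Var(E[Y|X]).
    """
    m = beta0 + beta @ mean                      # mean of the linear predictor
    s2 = beta @ cov @ beta                       # variance of the linear predictor
    mean_y = np.exp(m + s2 / 2.0)                # E[Y] = E[exp(eta)]
    var_cond_mean = np.exp(2.0 * m + s2) * np.expm1(s2)
    return np.sqrt(var_cond_mean / (mean_y + var_cond_mean))

if __name__ == "__main__":
    # Hypothetical simulation settings, chosen only for illustration.
    rng = np.random.default_rng(0)
    beta0, beta = 0.5, np.array([0.3, -0.2])
    mean, cov = np.zeros(2), np.eye(2)
    X = rng.multivariate_normal(mean, cov, size=200)
    y = rng.poisson(np.exp(beta0 + X @ beta))
    print("sample RCC:", sample_rcc(X, y))
    print("true RCC:  ", true_rcc(beta0, beta, mean, cov))
```

Repeating the simulation over many replications and averaging the difference between `sample_rcc` and `true_rcc` would give the kind of bias and RMSE comparison the abstract describes.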

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics