Double-blind evaluation and benchmarking of survival models in a multi-centre study

Article ID	Journal	Published Year	Pages	File Type
506047	Computers in Biology and Medicine	2007	13 Pages	PDF

Abstract

Accurate modelling of time-to-event data is of particular importance for both exploratory and predictive analysis in cancer, and can have a direct impact on clinical care. This study presents a detailed double-blind evaluation of the accuracy of out-of-sample prediction of mortality from two generic non-linear models, using artificial neural networks benchmarked against partial logistic spline, log-normal and Cox regression models. A data set containing 2880 samples was shared over the Internet using a purpose-built secure environment called GEOCONDA (www.geoconda.com). The evaluation was carried out in three parts. The first was a comparison between the predicted survival estimates for each of the four survival groups defined by the TNM staging system and the empirical estimates derived by the Kaplan–Meier method. The second focused on the accurate prediction of survival over time, quantified with the time-dependent C index (Ctd). Finally, calibration plots were obtained over the range of follow-up and tested using a generalization of the Hosmer–Lemeshow test. All models showed satisfactory performance, with values of Ctd of about 0.7. None of the models showed a systematic tendency towards over- or underestimation of the observed survival at τ=3 and τ=5 years. At τ=10 years, all models underestimated the observed survival, except for Cox regression, which returned an overestimate. The study presents a robust and unbiased benchmarking methodology using a bespoke web facility. It was concluded that powerful, recent flexible modelling algorithms show predictive performance comparable to that of more established methods from the medical and biological literature, for the reference data set.
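
The first part of the evaluation compares model predictions against empirical Kaplan–Meier estimates. As an illustration only, the product-limit estimator can be sketched in a few lines of Python. The function name `kaplan_meier` and the toy data are assumptions made for this example and are not taken from the study, which does not describe its implementation.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : follow-up time for each subject
    events : 1 if the event (death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # every subject leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional survival at t
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Remove deaths and censored subjects from the risk set after time t
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: six subjects, two of them censored
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
```

In a comparison of this kind, such a curve would be computed separately for each staging group and set against the mean model-predicted survival for the same group at the same time points.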

Keywords

Uveal neoplasms; Survival analysis; Evaluation studies; Double-blind study