ROC curves and nonrandom data

Article ID	Journal	Published Year	Pages	File Type
4970333	Pattern Recognition Letters	2017	7 Pages	PDF

Abstract

• This paper shows that ROC curves that are constructed with nonrandom data are biased.
• The magnitude of this bias is explored using simulations.
• A procedure for plotting consistent ROC curves is introduced.
• The presented procedure works well with simulated and non-simulated data.

This paper shows that when a classifier is evaluated with nonrandom test data, the resulting ROC curves differ from the ROC curves that would be obtained with a random sample. To address this bias, this paper introduces a procedure for plotting ROC curves that are inferred from nonrandom test data. I provide simulations to illustrate the procedure as well as the magnitude of the bias found in empirical ROC curves constructed with nonrandom test data. The paper also includes a demonstration of the procedure on (non-simulated) data used to model wine preferences.
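The kind of bias described above can be illustrated with a small simulation. The sketch below is not the paper's procedure or its simulation design. It assumes a hypothetical population in which classifier scores are normally distributed with unit variance and a mean shift of 1 between classes, and a hypothetical selection rule under which cases with ambiguous scores are rarely included in the test set. It then compares the area under the empirical ROC curve (AUC) for a random sample and for the nonrandom sample.

```python
import numpy as np


def auc(scores, labels):
    """Area under the empirical ROC curve via the rank-sum identity.

    Assumes continuous scores, so ties are not handled.
    """
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)  # rank 1 = lowest score
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    rank_sum_pos = ranks[labels == 1].sum()
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


rng = np.random.default_rng(0)

# Hypothetical population: balanced classes, positives score higher on average.
n = 20000
labels = rng.integers(0, 2, size=n)
scores = rng.normal(loc=labels.astype(float), scale=1.0)

# Random test sample: every case is equally likely to be included.
random_idx = rng.choice(n, size=2000, replace=False)

# Nonrandom test sample (assumed selection rule): cases whose score lies
# far from the midpoint 0.5 are always kept, ambiguous cases only 10% of the time.
keep_prob = np.where(np.abs(scores - 0.5) > 1.0, 1.0, 0.1)
nonrandom_mask = rng.random(n) < keep_prob

auc_population = auc(scores, labels)
auc_random = auc(scores[random_idx], labels[random_idx])
auc_nonrandom = auc(scores[nonrandom_mask], labels[nonrandom_mask])

print(f"population AUC:       {auc_population:.3f}")
print(f"random-sample AUC:    {auc_random:.3f}")
print(f"nonrandom-sample AUC: {auc_nonrandom:.3f}")
```

Under these assumptions the theoretical population AUC is Φ(1/√2) ≈ 0.76, and the random sample should land close to it. The nonrandom sample should look noticeably better, because the selection rule removes exactly the cases the classifier finds hard. Selection that depends only on the class label would not have this effect, since the ROC curve is computed from within-class score distributions.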

Keywords

41A05; 65D05; 41A10; 65D17; Classifier evaluation; ROC curves