Article ID: 5132144
Journal: Chemometrics and Intelligent Laboratory Systems
Published Year: 2017
Pages: 8
File Type: PDF
Abstract

• A new measure for evaluating the predictive performance of regression models is proposed.
• The proposed measure can consider the applicability domains (ADs) of regression models.
• Some regression models have global predictive ability and others have local predictive ability.
• The proposed measure can select between local and global regression models.
• The prediction accuracy of data sets improved using the proposed measure.

The coefficient of determination and the root-mean-squared error (RMSE) evaluate regression models on test samples without considering the applicability domains (ADs) of the models. In this study, we propose a new measure for evaluating the predictive performance of regression models that takes their ADs into account. The aim is not to select the single best regression model among competing models, but to determine an appropriate group of models according to the AD of each model. The proposed measure is the area under the coverage and RMSE curve for coverage less than p% (p%-AUCR). We confirm that some regression models have global predictive ability whereas others have local predictive ability, and that p%-AUCR is an appropriate indicator for selecting between local and global regression models depending on the coverage and with the AD taken into account. Selecting a regression model for each sample or each chemical structure using p%-AUCR can improve the prediction accuracy of data sets.

Related Topics
Physical Sciences and Engineering > Chemistry > Analytical Chemistry