Effects of sample size on the accuracy of geomorphological models

Article ID	Journal	Published Year	Pages	File Type
4686484	Geomorphology	2008	10 Pages	PDF

Abstract

Commonly, the most costly part of geomorphological distribution modelling studies is gathering the data. Thus, guidance for researchers concerning the quantity of field data needed would be extremely practical. This paper scrutinises the relationship between the sample size (the number of observations varied from 20 to 600) and the predictive ability of the generalized linear model (GLM), generalized additive model (GAM), generalized boosting method (GBM) and artificial neural network (ANN) in two data settings, i.e., independent and split-sample approaches. The study was performed using empirical data of periglacial processes from an area of 600 km² in northernmost Finland at grid resolutions of 1 ha (100 × 100 m) and 25 ha (500 × 500 m). A rather sharp increase in the predictive ability of the models was observed when the number of observations increased from 20 to 100, and the level of robust predictions was reached with 200 observations. The result indicates that no more than a few hundred observations are needed in geomorphological distribution modelling at a medium scale resolution (ca. 0.01-1 km²).
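
The kind of experiment the abstract describes can be sketched as a learning curve: fit a model on progressively larger samples and score each fit on data it has not seen. The sketch below is only an illustration under stated assumptions. It uses synthetic presence/absence data rather than the Finnish periglacial data, a plain logistic GLM fitted by gradient descent rather than the paper's GLM, GAM, GBM and ANN implementations, and AUC as the accuracy measure, which the abstract does not name. All function names, the two predictors and their coefficients are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic function, with z clamped to avoid overflow in exp()."""
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, rng):
    """Draw n synthetic grid cells: two predictors and a 0/1 response.

    The coefficients (1.5, -1.0) are arbitrary, chosen for illustration.
    """
    rows = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        p = sigmoid(1.5 * x1 - 1.0 * x2)
        rows.append(((x1, x2), 1 if rng.random() < p else 0))
    return rows


def fit_logistic(rows, steps=300, lr=0.5):
    """Fit a logistic GLM (intercept + two slopes) by batch gradient descent."""
    w = [0.0, 0.0, 0.0]
    n = len(rows)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for (x1, x2), y in rows:
            err = sigmoid(w[0] + w[1] * x1 + w[2] * x2) - y
            grad[0] += err
            grad[1] += err * x1
            grad[2] += err * x2
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both presences and absences")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def learning_curve(sizes=(20, 50, 100, 200, 400, 600), repeats=5, seed=0):
    """Mean held-out AUC for each training sample size."""
    rng = random.Random(seed)
    pool = simulate(max(sizes), rng)   # observations available for training
    test = simulate(500, rng)          # separately drawn evaluation set
    labels = [y for _, y in test]
    curve = {}
    for n in sizes:
        total = 0.0
        for _ in range(repeats):
            w = fit_logistic(rng.sample(pool, n))
            scores = [sigmoid(w[0] + w[1] * x1 + w[2] * x2)
                      for (x1, x2), _ in test]
            total += auc(scores, labels)
        curve[n] = total / repeats
    return curve


if __name__ == "__main__":
    for n, value in learning_curve().items():
        print(f"n = {n:3d}  mean AUC = {value:.3f}")
```

Here the evaluation set is drawn separately from the training pool, which loosely resembles the independent setting; a split-sample variant would instead hold out part of the same pool. Averaging over several random subsamples per size reduces the noise that is otherwise large at the smallest sample sizes.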

Keywords

Periglacial; Generalized additive model (GAM); Artificial neural networks (ANN); Generalized linear model (GLM); Predictive mapping