Article code: 4375132
Journal code: 1303244
Publication year: 2011
English article: 6-page PDF
Full-text version: free download
English title of the ISI article
Bagging GLM: Improved generalized linear model for the analysis of zero-inflated data
Related topics
Life Sciences and Biotechnology; Agricultural and Biological Sciences; Ecology, Evolution, Behavior and Systematics
English abstract

Species-occurrence data sets tend to contain a large proportion of zero values, i.e., absence values (zero-inflated). Statistical inference using such data sets is likely to be inefficient or lead to incorrect conclusions unless the data are treated carefully. In this study, we propose a new modeling method that combines a regression model and a machine-learning technique to overcome the problems caused by zero-inflated data sets. We combined a generalized linear model (GLM), which is widely used in ecology, with bootstrap aggregation (bagging), a machine-learning technique. Using our new method and the traditional GLM, we established distribution models of Vincetoxicum pycnostelma (a vascular plant) and Ninox scutulata (an owl), both of which are endangered and have zero-inflated distribution patterns, and compared model performances. We also modeled four theoretical data sets containing different ratios of presence/absence values with the new and traditional methods and compared model performances. For the distribution models, our new method performed well compared to traditional GLMs. After bagging, area under the curve (AUC) values were almost the same as with traditional methods, but sensitivity values were higher. Additionally, our new method showed higher sensitivity values than the traditional GLM when modeling a theoretical data set containing a large proportion of zero values. These results indicate that our new method has high predictive ability for presence data when analyzing zero-inflated data sets. Generally, predicting presence data is more difficult than predicting absence data. Our new modeling method has potential for advancing species distribution modeling.
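The combination of a GLM with bagging described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the resampling scheme, the number of bootstrap replicates, or how the replicate models are aggregated, so everything specific here is an assumption: plain resampling with replacement, a binomial GLM with a logit link and a single predictor, a tiny ridge penalty to keep fits finite when a resample is nearly all absences, and averaging of predicted probabilities. The function names (fit_logistic_glm, bagging_glm, predict_bagged) are hypothetical and the demo data are synthetic, not taken from the paper.

```python
import math
import random


def _sigmoid(eta):
    # Clip the linear predictor so math.exp cannot overflow.
    eta = max(-30.0, min(30.0, eta))
    return 1.0 / (1.0 + math.exp(-eta))


def fit_logistic_glm(x, y, n_iter=25, ridge=1e-4):
    """Fit a one-predictor binomial GLM (logit link) by Newton-Raphson.

    `ridge` is a small penalty (an assumption of this sketch) that keeps
    the estimates finite when a bootstrap sample is nearly all zeros.
    Returns (intercept, slope).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        # Gradient and Hessian of the penalized log-likelihood.
        g0, g1 = -ridge * b0, -ridge * b1
        h00, h01, h11 = ridge, 0.0, ridge
        for xi, yi in zip(x, y):
            p = _sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Solve the 2x2 Newton system H * step = g explicitly.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def bagging_glm(x, y, n_boot=50, seed=0):
    """Fit one GLM per bootstrap resample (sampling rows with replacement)."""
    rng = random.Random(seed)
    n = len(x)
    models = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_logistic_glm([x[i] for i in idx],
                                       [y[i] for i in idx]))
    return models


def predict_bagged(models, x_new):
    """Average the predicted presence probabilities over all replicates."""
    return [sum(_sigmoid(b0 + b1 * xi) for b0, b1 in models) / len(models)
            for xi in x_new]


if __name__ == "__main__":
    # Synthetic zero-inflated presence/absence data (illustrative only).
    rng = random.Random(42)
    x = [rng.uniform(-2.0, 2.0) for _ in range(300)]
    y = [1 if rng.random() < _sigmoid(-2.5 + 1.5 * xi) else 0 for xi in x]
    models = bagging_glm(x, y, n_boot=50, seed=1)
    print(predict_bagged(models, [-2.0, 0.0, 2.0]))
```

Averaging probabilities is only one possible aggregation rule; majority voting over thresholded predictions would be another, and the two can give different sensitivity values on the same data.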

Research highlights
► This paper proposes a new modelling method for species distribution to overcome the problems caused by zero-inflated data sets.
► We combined a regression model and a machine-learning technique to form the new method.
► We compared model performance between the new method and the traditional method using several zero-inflated data sets.
► Our results showed that the new method has high predictive ability for presence data when analyzing zero-inflated data sets.
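The two performance measures named in the abstract, sensitivity and AUC, can be computed as in the sketch below. Sensitivity is the fraction of true presences predicted as present, and AUC is the probability that a randomly chosen presence receives a higher score than a randomly chosen absence. The 0.5 threshold is an assumption, since the source does not say which cutoff was used, and the function names and example values are hypothetical.

```python
def sensitivity(y_true, y_prob, threshold=0.5):
    """Fraction of true presences (1s) predicted as present."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    return tp / (tp + fn)


def auc(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Made-up scores for four absences and two presences.
y_true = [0, 0, 0, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.6, 0.4, 0.9]
sens_default = sensitivity(y_true, y_prob)                # threshold 0.5
sens_lower = sensitivity(y_true, y_prob, threshold=0.4)   # lower cutoff
auc_value = auc(y_true, y_prob)
```

AUC depends only on how the scores are ranked, while sensitivity depends on where the scores fall relative to the cutoff, which is why two models can have nearly equal AUC values and still differ in sensitivity.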

Publisher
Database: Elsevier - ScienceDirect
Journal: Ecological Informatics - Volume 6, Issue 5, September 2011, Pages 270–275