Article code: 5770426
Journal code: 1629427
Publication year: 2017
English article: 18-page PDF
Full text: free download
English title of the ISI article
Comparing the use of training data derived from legacy soil pits and soil survey polygons for mapping soil classes
Keywords
Digital soil mapping, machine learning, soil classification, data mining, model comparison, ensemble learning
Related topics
Physical Sciences and Engineering > Earth and Planetary Sciences > Earth-Surface Processes
English abstract


- Soil Great Groups were mapped for the Okanagan-Kamloops region.
- Training data derived from legacy soil pits and soil survey polygons were compared.
- Single-learner and ensemble-learner models were compared.
- Random Forest was most effective regardless of training data type.
- Predictions made from polygon-derived training data had higher accuracies.

Machine learners used for digital soil mapping are generally trained using data derived either from field-observed soil pits or from soil survey polygons, although no direct comparison of the accuracy resulting from the two methods has yet been undertaken. This study made such a comparison over the Okanagan Valley and Kamloops region of British Columbia, where good-quality soil pit and soil survey data were available. A standard set of environmental variables, including vegetative, climatic, and topographic indices, was used to predict soil Great Groups in accordance with the Canadian System of Soil Classification. The pit-derived training dataset was developed using n = 478 points from the British Columbia Soil Information System, while the polygon-derived training dataset was developed through random sampling of single-component soil survey map units based on an area-weighted approach. In both cases, the training points were intersected with a suite of 18 environmental covariates, reduced from 27 covariates using principal component analysis, and submitted to a machine learner for predictions at a 100 m spatial resolution. Four single-model learners (CART, k-nearest neighbor, multinomial logistic regression, and logistic model tree) and five ensemble-model learners (CART with bagging, k-nearest neighbor with bagging, multinomial logistic regression with bagging, logistic model trees with bagging, and Random Forest) were compared. Surfaces of prediction uncertainty were produced using ignorance uncertainty, and results were validated using a 5-fold cross-validation procedure. Predictions made using polygon-derived training data were consistently more accurate across all models. Random Forest was the most effective learner, with C = 61% accuracy when using pit-derived training data and C = 68% accuracy when using polygon-derived training data.
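
The abstract does not spell out the area-weighted sampling procedure, so the following is only a minimal sketch of one plausible reading: training points are allocated to single-component map units with probability proportional to polygon area, which is equivalent to scattering points uniformly over the combined polygon area. The function name `allocate_samples`, the polygon ids, and the areas are hypothetical and are not taken from the study.

```python
import random


def allocate_samples(polygon_areas, n_points, seed=0):
    """Allocate training points to polygons in proportion to their area.

    polygon_areas: dict mapping a (hypothetical) polygon id to its area.
    Returns a dict mapping each polygon id to the number of points it receives.
    """
    rng = random.Random(seed)
    ids = list(polygon_areas)
    weights = [polygon_areas[i] for i in ids]
    # Each draw picks one polygon with probability area / total area.
    draws = rng.choices(ids, weights=weights, k=n_points)
    counts = {i: 0 for i in ids}
    for i in draws:
        counts[i] += 1
    return counts


# Illustrative map units only (areas in hectares are made up).
map_units = {"unit_A": 10.0, "unit_B": 30.0, "unit_C": 60.0}
print(allocate_samples(map_units, n_points=500, seed=1))
```

In a full workflow, each allocated point would then be placed at a random location inside its polygon, labelled with that polygon's Great Group, and intersected with the covariate rasters.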
When single-model and ensemble learners were compared, the bagging algorithm resulted in a 2-11% increase in accuracy when using pit-derived training data. Ensemble models also allowed for the visualization of prediction uncertainty. This study provides further insight into the use of legacy soil data and the development of training data for digital soil mapping.
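
To illustrate how an ensemble can yield a per-cell uncertainty surface, here is a minimal sketch that assumes ignorance uncertainty takes the normalized-entropy form, computed from the share of ensemble members voting for each class. The abstract does not state the formula, so this definition is an assumption, and the function name and the example votes are hypothetical.

```python
import math
from collections import Counter


def ignorance_uncertainty(votes, n_classes):
    """Normalized entropy of ensemble votes for one grid cell.

    votes: list of class labels, one per ensemble member.
    n_classes: number of candidate classes (must be at least 2).
    Returns a value between 0 (all members agree) and 1 (votes spread
    evenly over all classes).
    """
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")
    if not votes:
        raise ValueError("votes must not be empty")
    total = len(votes)
    proportions = [c / total for c in Counter(votes).values()]
    entropy = -sum(p * math.log(p) for p in proportions)
    # max() only guards against a negative zero when all votes agree.
    return max(0.0, entropy / math.log(n_classes))


# Illustrative votes from a 10-member ensemble for a single cell.
cell_votes = ["Gray Luvisol"] * 7 + ["Eutric Brunisol"] * 3
print(ignorance_uncertainty(cell_votes, n_classes=4))
```

Applying the function to every cell of the prediction grid gives a surface in which low values mark strong agreement among ensemble members and high values mark cells where the vote is split.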

Publisher
Database: Elsevier - ScienceDirect
Journal: Geoderma - Volume 290, 15 March 2017, Pages 51-68