Article code: 6408426
Journal code: 1629452
Publication year: 2016
English article: 16 pages (PDF)
Full text: free download
English title of the ISI article
An overview and comparison of machine-learning techniques for classification purposes in digital soil mapping
Related topics
Physical Sciences and Engineering > Earth and Planetary Sciences > Earth-Surface Processes
English abstract


- Soil taxonomic units were mapped for the Lower Fraser Valley.
- 10 machine-learning algorithms were compared.
- Four methods of developing training data were compared.
- Sampling from soil surveys using an area-weighted approach was most effective.
- Choice of model and sampling design greatly influences outputs.
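
As a rough illustration of the area-weighted idea in the highlights above, the sketch below splits a fixed budget of training points across soil classes in proportion to their mapped area, and then optionally balances the classes by random oversampling (the ROS variant named in the abstract). This is one plausible reading, not the authors' implementation: the function names, the rounding rule, and the class labels in the usage note are all assumptions made for illustration.

```python
import random
from collections import Counter


def area_weighted_allocation(class_areas, n_total):
    """Split n_total training points across classes in proportion to mapped area.

    class_areas: dict mapping a class label to its polygon area (any unit).
    Rounding rule (an assumption): largest fractional remainder gets the spare points.
    """
    total_area = sum(class_areas.values())
    exact = {c: n_total * a / total_area for c, a in class_areas.items()}
    alloc = {c: int(q) for c, q in exact.items()}
    # Hand out the points lost to truncation, largest fractional part first.
    leftover = n_total - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda c: exact[c] - alloc[c], reverse=True)
    for c in by_remainder[:leftover]:
        alloc[c] += 1
    return alloc


def random_oversample(labels, seed=0):
    """Return sample indices in which every class is topped up, by drawing
    with replacement, to the size of the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    target = max(len(ix) for ix in by_class.values())
    out = []
    for ix in by_class.values():
        out.extend(ix)                                   # keep every original sample
        out.extend(rng.choices(ix, k=target - len(ix)))  # duplicate minority samples
    return out
```

For example, with hypothetical areas `{"A": 50.0, "B": 30.0, "C": 20.0}` and a budget of 7 points, the allocation favours the largest class while still sampling the small ones; `Counter` over the oversampled labels can be used to check that the classes end up the same size.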

Machine-learning is the automated process of uncovering patterns in large datasets using computer-based statistical models, where a fitted model may then be used for prediction purposes on new data. Despite the growing number of machine-learning algorithms that have been developed, relatively few studies have provided a comparison of an array of different learners; typically, model comparison studies have been restricted to a comparison of only a few models. This study evaluates and compares a suite of 10 machine-learners as classification algorithms for the prediction of soil taxonomic units in the Lower Fraser Valley, British Columbia, Canada.

A variety of machine-learners (CART, CART with bagging, Random Forest, k-nearest neighbor, nearest shrunken centroid, artificial neural network, multinomial logistic regression, logistic model trees, and support vector machine) were tested in the extraction of the complex relationships between soil taxonomic units (great groups and orders) from a conventional soil survey and a suite of 20 environmental covariates representing the topography, climate, and vegetation of the study area. Methods used to extract training data from a soil survey included by-polygon, equal-class, area-weighted, and area-weighted with random over sampling (ROS) approaches. The fitted models, which consist of the soil-environmental relationships, were then used to predict soil great groups and orders for the entire study area at a 100 m spatial resolution. The resulting maps were validated using 262 points from legacy soil data.

On average, the area-weighted sampling approach for developing training data from a soil survey was most effective. Using a validation of R = 1 cell, the k-nearest neighbor and support vector machine with radial basis function resulted in the highest accuracy of 72% for great groups using ROS; however, models such as CART with bagging, logistic model trees, and Random Forest were preferred due to the speed of parameterization and the interpretability of the results while resulting in similar accuracies ranging from 65-70% using the area-weighted sampling approach. Model choice and sample design greatly influenced outputs. This study provides a comprehensive comparison of machine-learning techniques for classification purposes in soil science and may assist in model selection for digital soil mapping and geomorphic modeling studies in the future.
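
The abstract does not spell out what "a validation of R = 1 cell" means. A minimal sketch of one common interpretation follows, assuming that a validation point counts as correct when its observed class appears anywhere in the predicted map within R cells of the point's own cell (a square window; R = 0 is an exact cell match). The function name, the window shape, and the toy grid in the usage note are hypothetical.

```python
def neighbourhood_accuracy(pred_grid, points, radius=1):
    """Share of validation points whose observed class occurs within
    `radius` cells of the point's cell in the predicted class map.

    pred_grid: list of rows, each a list of predicted class labels.
    points: iterable of (row, col, observed_class) tuples.
    Assumption: square (2 * radius + 1) window, clipped at the map edge.
    """
    n_rows, n_cols = len(pred_grid), len(pred_grid[0])
    hits = 0
    for row, col, observed in points:
        r0, r1 = max(0, row - radius), min(n_rows, row + radius + 1)
        c0, c1 = max(0, col - radius), min(n_cols, col + radius + 1)
        # Collect every class predicted inside the window around the point.
        window = {pred_grid[r][c] for r in range(r0, r1) for c in range(c0, c1)}
        hits += observed in window
    return hits / len(points)
```

On a toy 3 x 3 map, calling the function with `radius=0` and then `radius=1` on the same points shows how the tolerance raises the reported accuracy, which is worth keeping in mind when comparing the 72% and 65-70% figures against studies that use exact-cell validation.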

Publisher
Database: Elsevier - ScienceDirect
Journal: Geoderma - Volume 265, 1 March 2016, Pages 62-77