The large scale digital mapping of soil organic carbon using machine learning algorithms
The results of digital mapping of organic carbon content within the arable horizons of soils and the assessment of obtained models accuracy with the use of machine learning methods for the area of Central Russian Upland in Voronezh Oblast are presented. The digital mapping was based on 22 points of...
Main Authors: | , |
---|---|
Format: | Article |
Language: | Russian |
Published: |
V.V. Dokuchaev Soil Science Institute
2018-03-01
|
Series: | Бюллетень Почвенного института им. В.В. Докучаева |
Subjects: | |
Online Access: | https://bulletin.esoil.ru/jour/article/view/202 |
Summary: | The results of digital mapping of organic carbon content within the arable horizons of soils and the assessment of obtained models accuracy with the use of machine learning methods for the area of Central Russian Upland in Voronezh Oblast are presented. The digital mapping was based on 22 points of soil samplings, applied for the learning and verification of models, and also on several sets of predictor variables. We took also digital elevation model, its derivatives and also remote sensing data of different spatial resolution as predictor variables. Several methods were used to create the spatial variability models for the investigated property based on the decision trees methods: random forest, boosting regression trees and Bayessian regression trees. The assessment of the models obtained accuracy was conducted by a method of cross-validation. As the accuracy indices we used the determination coefficient, mean absolute error and the root mean square error. The modelling results showed that the use of predictor variables presented by digital elevation model, its derivatives and Landsat 8 data we were able to obtain more sustainable models. The determination coefficient varied from 0.6 to 0.7, RMSEcv, i.e., the prognosing error varied from 0.5791 to 0.6520. Whereas, the best model was obtained with the method of Bayessian regression trees; whereas the predictor variables presented by the digital elevation model, its derivatives and Sentinel 2 data determination coefficient varied from 0.47 to 0.55, and the prognosing error varied from 0.7031 to 0.7909. It was revealed that in the described models according to different data sets the most significant were the various predictor variables. |
---|---|
ISSN: | 0136-1694 2312-4202 |