Machine learning and soil sciences: a review aided by machine learning tools

The application of machine learning (ML) techniques in various fields of science has increased rapidly, especially in the last 10 years. The increasing availability of soil data that can be efficiently acquired remotely and proximally, and freely available open-source algorithms, have led t...

Bibliographic Details
Main Authors: J. Padarian, B. Minasny, A. B. McBratney
Format: Article
Language: English
Published: Copernicus Publications 2020-02-01
Series: SOIL
Online Access: https://www.soil-journal.net/6/35/2020/soil-6-35-2020.pdf
id doaj-e7ac7e7432f2483698bde2467d9e6b8c
record_format Article
spelling doaj-e7ac7e7432f2483698bde2467d9e6b8c; 2020-11-25T02:18:34Z; eng; Copernicus Publications; SOIL; 2199-3971; 2199-398X; 2020-02-01; 6; 35; 52; 10.5194/soil-6-35-2020; Machine learning and soil sciences: a review aided by machine learning tools; J. Padarian; B. Minasny; A. B. McBratney; https://www.soil-journal.net/6/35/2020/soil-6-35-2020.pdf
collection DOAJ
language English
format Article
sources DOAJ
author J. Padarian
B. Minasny
A. B. McBratney
title Machine learning and soil sciences: a review aided by machine learning tools
publisher Copernicus Publications
series SOIL
issn 2199-3971
2199-398X
publishDate 2020-02-01
description The application of machine learning (ML) techniques in various fields of science has increased rapidly, especially in the last 10 years. The increasing availability of soil data that can be efficiently acquired remotely and proximally, and freely available open-source algorithms, have led to an accelerated adoption of ML techniques to analyse soil data. Given the large number of publications, it is an impossible task to manually review all papers on the application of ML in soil science without narrowing down a narrative of ML application in a specific research question. This paper aims to provide a comprehensive review of the application of ML techniques in soil science aided by an ML algorithm (latent Dirichlet allocation) to find patterns in a large collection of text corpora. The objective is to gain insight into publications of ML applications in soil science and to discuss the research gaps in this topic. We found that (a) there is an increasing usage of ML methods in soil sciences, mostly concentrated in developed countries, (b) the reviewed publications can be grouped into 12 topics, namely remote sensing, soil organic carbon, water, contamination, methods (ensembles), erosion and parent material, methods (NN, neural networks, SVM, support vector machines), spectroscopy, modelling (classes), crops, physical, and modelling (continuous), and (c) advanced ML methods usually perform better than simpler approaches thanks to their capability to capture non-linear relationships. From these findings, we found research gaps, in particular, about the precautions that should be taken (parsimony) to avoid overfitting, and that the interpretability of the ML models is an important aspect to consider when applying advanced ML methods in order to improve our knowledge and understanding of soil. We foresee that a large number of studies will focus on the latter topic.
url https://www.soil-journal.net/6/35/2020/soil-6-35-2020.pdf