Comparison of Machine Learning Regression Algorithms for Cotton Leaf Area Index Retrieval Using Sentinel-2 Spectral Bands

Leaf area index (LAI) is a crucial crop biophysical parameter that has been widely used in a variety of fields. Five state-of-the-art machine learning regression algorithms (MLRAs), namely, artificial neural network (ANN), support vector regression (SVR), Gaussian process regression (GPR), random fo...


Bibliographic Details
Main Authors: Huihui Mao, Jihua Meng, Fujiang Ji, Qiankun Zhang, Huiting Fang
Format: Article
Language: English
Published: MDPI AG 2019-04-01
Series: Applied Sciences
Subjects:
Online Access: https://www.mdpi.com/2076-3417/9/7/1459
id doaj-576179f335a54a04a8e0cb65ec7ef887
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author Huihui Mao
Jihua Meng
Fujiang Ji
Qiankun Zhang
Huiting Fang
author_sort Huihui Mao
title Comparison of Machine Learning Regression Algorithms for Cotton Leaf Area Index Retrieval Using Sentinel-2 Spectral Bands
publisher MDPI AG
series Applied Sciences
issn 2076-3417
publishDate 2019-04-01
description Leaf area index (LAI) is a crucial crop biophysical parameter that has been widely used in a variety of fields. Five state-of-the-art machine learning regression algorithms (MLRAs), namely, artificial neural network (ANN), support vector regression (SVR), Gaussian process regression (GPR), random forest (RF) and gradient boosting regression tree (GBRT), have been used in the retrieval of cotton LAI with Sentinel-2 spectral bands. The performances of the five machine learning models are compared to guide the application of MLRAs in remote sensing, since challenging problems remain in selecting an MLRA for crop LAI retrieval and in deciding the optimal training sample size and number of spectral bands for each MLRA. A comprehensive evaluation was carried out with respect to model accuracy, computational efficiency, sensitivity to training sample size and sensitivity to spectral bands. We compared the five MLRAs in an agricultural area of Northwest China over three cotton seasons, with corresponding field campaigns for modeling and validation. Results show that the GBRT model outperforms the other models in average model accuracy ($\overline{R^2}$ = 0.854, $\overline{\mathrm{RMSE}}$ = 0.674 and $\overline{\mathrm{MAE}}$ = 0.456).
SVR achieves the best computational efficiency: it is fast to train and to validate, and therefore has great potential to deliver near-real-time operational products for crop management. As for sensitivity to training sample size, GBRT is the most robust model and provides the best average model accuracy across the variations in training sample size, compared with the other models ($\overline{R^2}$ = 0.884, $\overline{\mathrm{RMSE}}$ = 0.615 and $\overline{\mathrm{MAE}}$ = 0.452).
Spectral band sensitivity analysis with distance correlation (dCor), combined with a backward elimination approach, indicates that SVR, GPR and RF are relatively robust to changes in the spectral bands, while ANN outperforms the other models in average model accuracy as spectral bands are removed ($\overline{R^2}$ = 0.881, $\overline{\mathrm{RMSE}}$ = 0.625 and $\overline{\mathrm{MAE}}$ = 0.480). A comprehensive evaluation indicates that GBRT is an appealing alternative for cotton LAI retrieval, except for its computational efficiency. Despite their different performance, all models exhibited considerable potential for cotton LAI retrieval and could offer accurate and timely crop parameter information for field management and agricultural production decisions.
topic leaf area index (LAI)
machine learning
Sentinel-2
sensitivity analysis
training sample size
spectral bands
url https://www.mdpi.com/2076-3417/9/7/1459
_version_ 1725128706888826880
spelling doaj-576179f335a54a04a8e0cb65ec7ef887; 2020-11-25T01:21:53Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2019-04-01; vol. 9, no. 7, article 1459; DOI 10.3390/app9071459; app9071459
Authors: Huihui Mao, Jihua Meng, Fujiang Ji, Qiankun Zhang, Huiting Fang
Affiliation (all authors): Key Laboratory of Digital Earth Sciences, Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100101, China
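Note on the evaluation described in the abstract: the record reports accuracy as R², RMSE and MAE and mentions a dCor-based backward elimination of spectral bands, but gives no formulas or code. The sketch below is only an illustration of those quantities in plain Python, not the authors' implementation. The function names, the rule "drop the band with the lowest dCor to LAI first", and any band values passed in are assumptions made for the example; in a full analysis each band subset would be used to retrain and re-score every model.

```python
import math
from typing import List, Mapping, Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination (assumes y_true is not constant)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def _double_centered(x: Sequence[float]) -> List[List[float]]:
    """Pairwise distance matrix of x with row, column and grand means removed."""
    n = len(x)
    dist = [[abs(a - b) for b in x] for a in x]
    row_mean = [sum(row) / n for row in dist]  # equals column means (symmetric)
    grand_mean = sum(row_mean) / n
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Sample distance correlation (dCor) between two 1-D samples."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    a, b = _double_centered(x), _double_centered(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for row in a for v in row) / n ** 2
    dvar_y = sum(v * v for row in b for v in row) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:  # a constant sample carries no information
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / denom)


def backward_elimination_subsets(
    bands: Mapping[str, Sequence[float]], lai: Sequence[float]
) -> List[List[str]]:
    """Band subsets from all bands down to one.

    Assumed rule (not taken from the record): at each step the remaining
    band with the lowest dCor to LAI is dropped.
    """
    scores = {name: distance_correlation(values, lai) for name, values in bands.items()}
    remaining = sorted(scores, key=lambda name: scores[name], reverse=True)
    subsets = []
    while remaining:
        subsets.append(list(remaining))
        remaining.pop()  # least relevant band is last in the list
    return subsets
```

Averaging `r2_score`, `rmse` and `mae` over the model runs obtained for each subset returned by `backward_elimination_subsets` would give mean values of the kind quoted in the description.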