Long-Term Hindcasts of Wheat Yield in Fields Using Remotely Sensed Phenology, Climate Data and Machine Learning

Satellite remote sensing offers a cost-effective means of generating long-term hindcasts of yield that can be used to understand how yield varies in time and space. This study investigated the use of remotely sensed phenology, climate data and machine learning for estimating yield at a resolution su...

Bibliographic Details
Main Authors:	Fiona H. Evans, Jianxiu Shen
Format:	Article
Language:	English
Published:	MDPI AG 2021-06-01
Series:	Remote Sensing
Subjects:	Landsat; NDVI; crop phenology; yield estimation; long-term; hindcasts
Online Access:	https://www.mdpi.com/2072-4292/13/13/2435

id	doaj-7ebeb810dc154e2895fc93507473af2d
record_format	Article
spelling	doaj-7ebeb810dc154e2895fc93507473af2d; 2021-07-15T15:44:03Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2021-06-01; 13; 2435; 2435; 10.3390/rs13132435; Long-Term Hindcasts of Wheat Yield in Fields Using Remotely Sensed Phenology, Climate Data and Machine Learning; Fiona H. Evans (Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, 90 South Street, Murdoch, WA 6150, Australia); Jianxiu Shen (Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, 90 South Street, Murdoch, WA 6150, Australia); https://www.mdpi.com/2072-4292/13/13/2435; Landsat; NDVI; crop phenology; yield estimation; long-term; hindcasts
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Fiona H. Evans Jianxiu Shen
spellingShingle	Fiona H. Evans Jianxiu Shen Long-Term Hindcasts of Wheat Yield in Fields Using Remotely Sensed Phenology, Climate Data and Machine Learning Remote Sensing Landsat NDVI crop phenology yield estimation long-term hindcasts
author_facet	Fiona H. Evans Jianxiu Shen
author_sort	Fiona H. Evans
title	Long-Term Hindcasts of Wheat Yield in Fields Using Remotely Sensed Phenology, Climate Data and Machine Learning
title_short	Long-Term Hindcasts of Wheat Yield in Fields Using Remotely Sensed Phenology, Climate Data and Machine Learning
title_full	Long-Term Hindcasts of Wheat Yield in Fields Using Remotely Sensed Phenology, Climate Data and Machine Learning
title_fullStr	Long-Term Hindcasts of Wheat Yield in Fields Using Remotely Sensed Phenology, Climate Data and Machine Learning
title_full_unstemmed	Long-Term Hindcasts of Wheat Yield in Fields Using Remotely Sensed Phenology, Climate Data and Machine Learning
title_sort	long-term hindcasts of wheat yield in fields using remotely sensed phenology, climate data and machine learning
publisher	MDPI AG
series	Remote Sensing
issn	2072-4292
publishDate	2021-06-01
description	Satellite remote sensing offers a cost-effective means of generating long-term hindcasts of yield that can be used to understand how yield varies in time and space. This study investigated the use of remotely sensed phenology, climate data and machine learning for estimating yield at a resolution suitable for optimising crop management in fields. We used spatially weighted growth curve estimation to identify the timing of phenological events from sequences of Landsat NDVI and derive phenological and seasonal climate metrics. Using data from a 17,000 ha study area, we investigated the relationships between the metrics and yield over 17 years from 2003 to 2019. We compared six statistical and machine learning models for estimating yield: multiple linear regression, mixed effects models, generalised additive models, random forests, support vector regression using radial basis functions and deep learning neural networks. We used a 50-50 train-test split on paddock-years: 50% of paddock-year combinations were randomly selected to train each model, and the remaining 50% were used to assess model accuracy. Using only phenological metrics, accuracy was highest using a linear mixed model with a random effect that allowed the relationship between integrated NDVI and yield to vary by year (R² = 0.67, MAE = 0.25 t ha⁻¹, RMSE = 0.33 t ha⁻¹, NRMSE = 0.25). We quantified the improvements in accuracy when seasonal climate metrics were also used as predictors. We identified two optimal models using the combined phenological and seasonal climate metrics: support vector regression and deep learning models (R² = 0.68, MAE = 0.25 t ha⁻¹, RMSE = 0.32 t ha⁻¹, NRMSE = 0.25). While the linear mixed model using only phenological metrics performed similarly to the nonlinear models that also use seasonal climate metrics, the nonlinear models can be more easily generalised to estimate yield in years for which training data are unavailable. We conclude that long-term hindcasts of wheat yield in fields, at 30 m spatial resolution, can be produced using remotely sensed phenology from Landsat NDVI, climate data and machine learning.
topic	Landsat; NDVI; crop phenology; yield estimation; long-term; hindcasts
url	https://www.mdpi.com/2072-4292/13/13/2435
work_keys_str_mv	AT fionahevans longtermhindcastsofwheatyieldinfieldsusingremotelysensedphenologyclimatedataandmachinelearning AT jianxiushen longtermhindcastsofwheatyieldinfieldsusingremotelysensedphenologyclimatedataandmachinelearning
_version_	1721298649567199232
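
The evaluation protocol summarised in the description (a random 50-50 split of paddock-year combinations, scored with R², MAE, RMSE and NRMSE) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, R² is assumed to be the coefficient of determination, and NRMSE is assumed to be RMSE divided by the mean observed yield, since the record does not state which definitions were used.

```python
import math
import random


def split_paddock_years(keys, train_fraction=0.5, seed=0):
    """Randomly assign unique (paddock, year) keys to train and test sets.

    Hypothetical helper: the record only says that 50% of paddock-year
    combinations were randomly selected for training.
    """
    unique = sorted(set(keys))
    rng = random.Random(seed)
    rng.shuffle(unique)
    n_train = round(len(unique) * train_fraction)
    return set(unique[:n_train]), set(unique[n_train:])


def accuracy_metrics(observed, predicted):
    """Return R2, MAE, RMSE and NRMSE for paired yield values (t/ha).

    Assumptions: R2 = 1 - SS_res / SS_tot, and NRMSE = RMSE / mean(observed).
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": rmse,
        "nrmse": rmse / mean_obs,
    }


if __name__ == "__main__":
    # Made-up paddock identifiers; the years match the 2003-2019 study period.
    keys = [(paddock, year) for paddock in range(4) for year in range(2003, 2020)]
    train, test = split_paddock_years(keys)
    print(len(train), len(test))
    # Toy yields purely to exercise the metric function.
    print(accuracy_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```

Splitting on whole paddock-year keys, rather than on individual pixels, keeps all pixels from one paddock in one season on the same side of the split.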
