Technical note: Inherent benchmark or not? Comparing Nash–Sutcliffe and Kling–Gupta efficiency scores


Bibliographic Details
Main Authors: W. J. M. Knoben, J. E. Freer, R. A. Woods
Format: Article
Language: English
Published: Copernicus Publications 2019-10-01
Series: Hydrology and Earth System Sciences
Online Access: https://www.hydrol-earth-syst-sci.net/23/4323/2019/hess-23-4323-2019.pdf
id doaj-490c879c254b475d8c96702277f4293d
record_format Article
spelling doaj-490c879c254b475d8c96702277f4293d; 2020-11-25T01:05:54Z; eng; Copernicus Publications; Hydrology and Earth System Sciences; ISSN 1027-5606, 1607-7938; 2019-10-01; vol. 23, pp. 4323–4331; doi:10.5194/hess-23-4323-2019
Technical note: Inherent benchmark or not? Comparing Nash–Sutcliffe and Kling–Gupta efficiency scores
W. J. M. Knoben (0, 1), J. E. Freer (2, 3), R. A. Woods (4, 5)
0 Department of Civil Engineering, University of Bristol, Bristol, BS8 1TR, UK
1 now at: University of Saskatchewan Coldwater Laboratory, Canmore, Alberta, Canada
2 School of Geographical Sciences, University of Bristol, Bristol, BS8 1BF, UK
3 Cabot Institute, University of Bristol, Bristol, BS8 1UJ, UK
4 Department of Civil Engineering, University of Bristol, Bristol, BS8 1TR, UK
5 Cabot Institute, University of Bristol, Bristol, BS8 1UJ, UK
collection DOAJ
language English
format Article
sources DOAJ
author W. J. M. Knoben
J. E. Freer
R. A. Woods
title Technical note: Inherent benchmark or not? Comparing Nash–Sutcliffe and Kling–Gupta efficiency scores
publisher Copernicus Publications
series Hydrology and Earth System Sciences
issn 1027-5606
1607-7938
publishDate 2019-10-01
description A traditional metric used in hydrology to summarize model performance is the Nash–Sutcliffe efficiency (NSE). Increasingly an alternative metric, the Kling–Gupta efficiency (KGE), is used instead. When NSE is used, NSE = 0 corresponds to using the mean flow as a benchmark predictor. The same reasoning is applied in various studies that use KGE as a metric: negative KGE values are viewed as bad model performance, and only positive values are seen as good model performance. Here we show that using the mean flow as a predictor does not result in KGE = 0, but instead KGE = 1 − √2 ≈ −0.41. Thus, KGE values greater than −0.41 indicate that a model improves upon the mean flow benchmark – even if the model's KGE value is negative. NSE and KGE values cannot be directly compared, because their relationship is non-unique and depends in part on the coefficient of variation of the observed time series. Therefore, modellers who use the KGE metric should not let their understanding of NSE values guide them in interpreting KGE values and instead develop new understanding based on the constitutive parts of the KGE metric and the explicit use of benchmark values to compare KGE scores against. More generally, a strong case can be made for moving away from ad hoc use of aggregated efficiency metrics and towards a framework based on purpose-dependent evaluation metrics and benchmarks that allows for more robust model adequacy assessment.
url https://www.hydrol-earth-syst-sci.net/23/4323/2019/hess-23-4323-2019.pdf
_version_ 1725192591793717248
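
Illustrative note (not part of the original record): the abstract's central claim, that the mean flow scores NSE = 0 but KGE = 1 − √2 ≈ −0.41, can be checked numerically. The sketch below is a minimal illustration and not the authors' code. It assumes the commonly used KGE formulation, KGE = 1 − sqrt((r − 1)² + (α − 1)² + (β − 1)²), with r the linear correlation, α the ratio of simulated to observed standard deviation, and β the ratio of simulated to observed mean; the record itself does not spell this formula out. It also assumes r = 0 for a constant simulation, where the correlation is formally undefined. The function names and the flow values are hypothetical.

```python
import math


def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for s, o in zip(sim, obs))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


def kge(sim, obs):
    """Kling-Gupta efficiency (assumed standard formulation, see lead-in)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    sd_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    if sd_sim < 1e-12:
        # Assumption: a constant simulation has no linear relation
        # with the observations, so r is taken as 0.
        r = 0.0
    else:
        cov = sum((s - mean_sim) * (o - mean_obs) for s, o in zip(sim, obs)) / n
        r = cov / (sd_sim * sd_obs)
    alpha = sd_sim / sd_obs      # variability ratio
    beta = mean_sim / mean_obs   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical observed flows; the benchmark predicts their mean everywhere.
obs = [1.0, 2.0, 4.0, 9.0]
mean_flow = [sum(obs) / len(obs)] * len(obs)

nse_benchmark = nse(mean_flow, obs)
kge_benchmark = kge(mean_flow, obs)

print(nse_benchmark)            # 0.0
print(round(kge_benchmark, 2))  # -0.41
```

Under these assumptions the benchmark has r = 0, α = 0 and β = 1, so the KGE reduces to 1 − sqrt(1 + 1 + 0) = 1 − √2. A model with a KGE of, say, −0.2 would therefore still improve on the mean flow even though its score is negative.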