Technical note: Inherent benchmark or not? Comparing Nash–Sutcliffe and Kling–Gupta efficiency scores
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | Copernicus Publications 2019-10-01 |
Series: | Hydrology and Earth System Sciences |
Online Access: | https://www.hydrol-earth-syst-sci.net/23/4323/2019/hess-23-4323-2019.pdf |
Summary: | <p>A traditional metric used in hydrology to summarize model
performance is the Nash–Sutcliffe efficiency (NSE). Increasingly an
alternative metric, the Kling–Gupta efficiency (KGE), is used instead. When
NSE is used, NSE <span class="inline-formula">=</span> 0 corresponds to using the mean flow as a benchmark
predictor. The same reasoning is applied in various studies that use KGE as
a metric: negative KGE values are viewed as bad model performance, and only
positive values are seen as good model performance. Here we show that using
the mean flow as a predictor does not result in KGE <span class="inline-formula">=</span> 0, but instead KGE <span class="inline-formula">= 1 − √2 ≈ −0.41</span>. Thus, KGE values greater than <span class="inline-formula">−0.41</span>
indicate that a model improves upon the mean flow benchmark – even if the
model's KGE value is negative. NSE and KGE values cannot be directly
compared, because their relationship is non-unique and depends in part on
the coefficient of variation of the observed time series. Therefore,
modellers who use the KGE metric should not let their understanding of NSE
values guide them in interpreting KGE values and instead develop new
understanding based on the constitutive parts of the KGE metric and the
explicit use of benchmark values to compare KGE scores against. More
generally, a strong case can be made for moving away from ad hoc use of
aggregated efficiency metrics and towards a framework based on
purpose-dependent evaluation metrics and benchmarks that allows for more
robust model adequacy assessment.</p> |
---|---|
ISSN: | 1027-5606 1607-7938 |
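
The benchmark values quoted in the summary can be checked numerically. The sketch below is illustrative only and is not taken from the article: it assumes the commonly used formulation KGE = 1 − sqrt((r − 1)² + (α − 1)² + (β − 1)²), with r the linear correlation, α the ratio of simulated to observed standard deviation, and β the ratio of simulated to observed mean. It also assumes r = 0 when the simulation has zero variance, which is the case consistent with the 1 − √2 result stated in the summary. The function names and the flow values are hypothetical.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed formulation, see lead-in)."""
    n = len(obs)
    mu_o = sum(obs) / n
    mu_s = sum(sim) / n
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    cov = sum((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)) / n
    # Assumption: correlation is taken as 0 for a constant series,
    # where it is otherwise undefined (0/0).
    r = cov / (sd_o * sd_s) if sd_o > 0 and sd_s > 0 else 0.0
    alpha = sd_s / sd_o  # variability ratio
    beta = mu_s / mu_o   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical observed flows; the mean is exactly 3.0.
obs = [1.0, 3.0, 2.0, 8.0, 4.0, 1.0, 2.0]

# Benchmark predictor: the observed mean flow at every time step.
mean_flow = [sum(obs) / len(obs)] * len(obs)

nse_benchmark = nse(obs, mean_flow)
kge_benchmark = kge(obs, mean_flow)

print("NSE of mean-flow benchmark:", round(nse_benchmark, 2))
print("KGE of mean-flow benchmark:", round(kge_benchmark, 2))  # about -0.41
```

For the mean-flow predictor, r = 0, α = 0 and β = 1, so the three squared terms sum to 2 and KGE = 1 − √2, while NSE is 0. A model scoring, say, KGE = −0.2 therefore still improves on this benchmark.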