The importance of making testable predictions: A cautionary tale.

We found a startling correlation (Pearson ρ > 0.97) between a single event in daily sea surface temperatures each spring and peak fish egg abundance measurements the following summer, in 7 years of approximately weekly fish egg abundance data collected at Scripps Pier in La Jolla, California. Eve...


Bibliographic Details
Main Authors: Emma S Choi, Erik Saberski, Tom Lorimer, Cameron Smith, Unduwap Kandage-Don, Ronald S Burton, George Sugihara
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0236541
id doaj-0b4c4b4e0c4645a990379f7c35b65916
record_format Article
spelling doaj-0b4c4b4e0c4645a990379f7c35b65916 2021-03-04T12:45:59Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2020-01-01 15 12 e0236541 10.1371/journal.pone.0236541 https://doi.org/10.1371/journal.pone.0236541
collection DOAJ
language English
format Article
sources DOAJ
author Emma S Choi
Erik Saberski
Tom Lorimer
Cameron Smith
Unduwap Kandage-Don
Ronald S Burton
George Sugihara
title The importance of making testable predictions: A cautionary tale.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2020-01-01
description We found a startling correlation (Pearson ρ > 0.97) between a single event in daily sea surface temperatures each spring and peak fish egg abundance measurements the following summer, in 7 years of approximately weekly fish egg abundance data collected at Scripps Pier in La Jolla, California. Even more surprising was that this event-based result persisted despite the large and variable number of fish species involved (up to 46) and the large and variable time interval between trigger and response (up to ~3 months). To mitigate potential over-fitting, we made an out-of-sample prediction, beyond the publication process, for the peak summer egg abundance observed at Scripps Pier in 2020 (available on bioRxiv). During peer review, the prediction failed. While it would be tempting to explain this away as a result of the record-breaking toxic algal bloom that occurred during the spring (9x higher concentration of dinoflagellates than ever previously recorded), a re-examination of our methodology revealed a potential source of over-fitting that had not been evaluated for robustness. This cautionary tale highlights the importance of testable, true out-of-sample predictions of future values that cannot (even accidentally) be used in model fitting, and that can therefore catch model assumptions that may otherwise escape notice. We believe that this example can benefit the current push towards ecology as a predictive science and support the notion that predictions should live and die in the public domain, along with the models that made them.
url https://doi.org/10.1371/journal.pone.0236541
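
The abstract's warning about over-fitting with only 7 annual data points can be illustrated with a small simulation. The sketch below is not the authors' method, and the record does not say what the source of over-fitting was. It assumes, purely for illustration, that an analyst screens many candidate predictors (here random noise) against a 7-point response and keeps the best one. The names `pearson` and `best_noise_correlation`, and the default of 200 candidates, are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def best_noise_correlation(n_years=7, n_candidates=200, seed=0):
    """Largest |r| found when screening pure-noise candidates.

    Assumption (illustrative only): the response and every candidate
    predictor are independent Gaussian noise, so any strong correlation
    found here is produced by selection, not by a real relationship.
    """
    rng = random.Random(seed)
    response = [rng.gauss(0, 1) for _ in range(n_years)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_years)]
        best = max(best, abs(pearson(candidate, response)))
    return best


if __name__ == "__main__":
    # Repeat the screening with different seeds to see how often noise
    # alone yields a strong in-sample correlation.
    results = sorted(best_noise_correlation(seed=s) for s in range(100))
    print(f"median best |r| across 100 runs: {results[50]:.2f}")
```

Because the selected predictor in this sketch is unrelated to the response, its in-sample correlation carries no information about a future year. That is the kind of failure a true out-of-sample prediction is designed to expose.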