A Medium and Long-Term Runoff Forecast Method Based on Massive Meteorological Data and Machine Learning Algorithms

Accurate and reliable predictor selection and model construction are key to medium and long-term runoff forecasting. In this study, 130 climate indexes are used as the primary forecast factors. Partial Mutual Information (PMI), Recursive Feature Elimination (RFE) and Classification and Regress...

Bibliographic Details
Main Authors: Yujie Li, Dong Wang, Jing Wei, Bo Li, Bin Xu, Yueping Xu, Huaping Huang
Format: Article
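Language: English
Published: MDPI AG 2021-05-01
Series: Water
Subjects: medium and long-term runoff forecast; machine learning; feature selection; ensemble learning; random forest; extreme gradient boosting
Online Access: https://www.mdpi.com/2073-4441/13/9/1308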
id doaj-5d95fd907d99420bb550407678132a0f
record_format Article
spelling doaj-5d95fd907d99420bb550407678132a0f 2021-05-31T23:26:15Z eng MDPI AG Water 2073-4441 2021-05-01 13 1308 1308 10.3390/w13091308
A Medium and Long-Term Runoff Forecast Method Based on Massive Meteorological Data and Machine Learning Algorithms
Yujie Li: College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, China
Dong Wang: Changjiang Water Resources Commission, Wuhan 430010, China
Jing Wei: Zhejiang Design Institute of Water Conservancy and Hydroelectric Power, Hangzhou 310002, China
Bo Li: Zhejiang Design Institute of Water Conservancy and Hydroelectric Power, Hangzhou 310002, China
Bin Xu: Hangzhou Design Institute of Water Conservancy and Hydropower, Hangzhou 310016, China
Yueping Xu: College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, China
Huaping Huang: China Water Resources Pearl River Planning Surveying & Designing Co., Ltd, Guangzhou 510610, China
https://www.mdpi.com/2073-4441/13/9/1308
collection DOAJ
language English
format Article
sources DOAJ
author Yujie Li
Dong Wang
Jing Wei
Bo Li
Bin Xu
Yueping Xu
Huaping Huang
title A Medium and Long-Term Runoff Forecast Method Based on Massive Meteorological Data and Machine Learning Algorithms
publisher MDPI AG
series Water
issn 2073-4441
publishDate 2021-05-01
description Accurate and reliable predictor selection and model construction are key to medium and long-term runoff forecasting. In this study, 130 climate indexes are used as the primary forecast factors. Partial Mutual Information (PMI), Recursive Feature Elimination (RFE) and Classification and Regression Tree (CART) are employed as typical Filter, Wrapper and Embedded algorithms of Feature Selection (FS), respectively, to obtain three final forecast schemes. Random Forest (RF) and Extreme Gradient Boosting (XGB) are constructed as representative Bagging and Boosting models of Ensemble Learning (EL), respectively, to forecast runoff at three lead times (monthly, seasonal and annual) for the Three Gorges Reservoir in the Yangtze River Basin. This study aims to summarize and compare the applicability and accuracy of different FS methods and EL models in medium and long-term runoff forecasting. The results show the following: (1) The RFE method shows the best forecast performance across all models and all forecast lead times. (2) Both RF and XGB are suitable for medium and long-term runoff forecasting, but XGB shows better forecast skill in both calibration and validation. (3) Forecast accuracy and reliability improve as runoff magnitude increases. However, it is still difficult to establish accurate and reliable forecasts when only large-scale climate indexes are used. We conclude that the theoretical framework based on Machine Learning could be useful to water managers who focus on medium and long-term runoff forecasting.
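
The wrapper-style selection named in the description (RFE) can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not give their code, data or settings. The sketch assumes a plain linear model fitted by gradient descent in place of RF or XGB, uses synthetic data instead of the 130 climate indexes, and all function and variable names (`standardize`, `fit_linear`, `rfe`, `runoff`) are hypothetical.

```python
import random

def standardize(columns):
    """Scale each feature column to zero mean and unit variance."""
    out = []
    for col in columns:
        mean = sum(col) / len(col)
        std = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5 or 1.0
        out.append([(v - mean) / std for v in col])
    return out

def fit_linear(columns, y, steps=300, lr=0.1):
    """Least-squares fit by gradient descent; returns one weight per column.
    Assumption: a linear model stands in for the ensemble models of the study."""
    n = len(y)
    y_mean = sum(y) / n
    weights = [0.0] * len(columns)
    for _ in range(steps):
        # Residuals of the current fit against the centred target.
        resid = [
            sum(w * col[i] for w, col in zip(weights, columns)) - (y[i] - y_mean)
            for i in range(n)
        ]
        for j, col in enumerate(columns):
            grad = sum(r * v for r, v in zip(resid, col)) / n
            weights[j] -= lr * grad
    return weights

def rfe(columns, y, n_keep):
    """Recursive feature elimination: refit, drop the weakest feature, repeat."""
    scaled = standardize(columns)
    active = list(range(len(columns)))
    while len(active) > n_keep:
        weights = fit_linear([scaled[j] for j in active], y)
        # The feature with the smallest absolute weight is ranked least important.
        weakest = min(range(len(active)), key=lambda k: abs(weights[k]))
        del active[weakest]
    return active

# Synthetic example: six candidate predictors, only columns 0 and 2 drive the target.
random.seed(0)
n_samples, n_features = 200, 6
X = [[random.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_features)]
runoff = [3.0 * X[0][i] - 2.0 * X[2][i] + random.gauss(0, 0.1) for i in range(n_samples)]

selected = rfe(X, runoff, n_keep=2)
print("selected feature indices:", selected)
```

The importance measure here is the absolute weight on standardized inputs; with a tree ensemble, a model-specific feature importance would typically take its place.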
topic medium and long-term runoff forecast
machine learning
feature selection
ensemble learning
random forest
extreme gradient boosting
url https://www.mdpi.com/2073-4441/13/9/1308