Rainfall forecasting model using machine learning methods: Case study Terengganu, Malaysia

Bibliographic Details
Main Authors: Wanie M. Ridwan, Michelle Sapitang, Awatif Aziz, Khairul Faizal Kushiar, Ali Najah Ahmed, Ahmed El-Shafie
Format: Article
Language: English
Published: Elsevier 2021-06-01
Series: Ain Shams Engineering Journal
Online Access: http://www.sciencedirect.com/science/article/pii/S2090447920302069
Description
Summary: Rainfall plays a major role in managing the water level in a reservoir. Unpredictable rainfall amounts caused by climate change can lead to either overflow or drying of the reservoir. In this study, several models and methods were applied to predict rainfall at Tasik Kenyir, Terengganu. The comparative study focused on developing and comparing several Machine Learning (ML) models, evaluating different scenarios and time horizons, and forecasting rainfall using two methods. The data consist of the average rainfall from 10 stations around the study area, weighted by station area using Thiessen polygons, together with projected rainfall. The forecasting model uses four ML algorithms: Bayesian Linear Regression (BLR), Boosted Decision Tree Regression (BDTR), Decision Forest Regression (DFR) and Neural Network Regression (NNR). Rainfall was predicted over different time horizons with these algorithms using two methods: method 1 (M1), Forecasting Rainfall Using Autocorrelation Function (ACF), and method 2 (M2), Forecasting Rainfall Using Projected Error. In M1, the best regression model for ACF is BDTR, since it has the highest coefficient of determination, R², after hyperparameter tuning. The results show coefficients between 0.5 and 0.9, with the highest value for each scenario being daily (0.9739693), weekly (0.989461), 10-day (0.9894429) and monthly (0.9998085). In M2, the overall model performance shows that LogNormal normalization gives good results in every category except 10-day, with BDTR and DFR giving more acceptable results than NNR and BLR. In conclusion, two methods were applied across different scenarios and time horizons, and M1 with BDTR modeling shows higher accuracy than M2.
ISSN: 2090-4479
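
Illustrative sketch: The summary mentions three computations without giving their details: a Thiessen-polygon weighted average of station rainfall, an autocorrelation function (ACF) for choosing lagged inputs, and the coefficient of determination R² for scoring. The Python below is a minimal sketch of those textbook definitions, not the authors' implementation. All function names, the example polygon areas, and the 95% significance bound (1.96/√n) used to pick lags are assumptions introduced here for illustration.

```python
import numpy as np


def thiessen_average(station_rain, polygon_areas):
    """Area-weighted mean rainfall per time step.

    station_rain: array of shape (n_steps, n_stations)
    polygon_areas: Thiessen polygon area for each station (hypothetical values)
    """
    rain = np.asarray(station_rain, dtype=float)
    areas = np.asarray(polygon_areas, dtype=float)
    weights = areas / areas.sum()  # each station's share of the total area
    return rain @ weights


def autocorrelation(series, max_lag):
    """Sample ACF for lags 0..max_lag (series must not be constant)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    acf = [1.0]
    for k in range(1, max_lag + 1):
        acf.append(np.dot(x[:-k], x[k:]) / denom)
    return np.array(acf)


def significant_lags(series, max_lag):
    """Lags whose ACF exceeds the approximate 95% bound (an assumed rule)."""
    acf = autocorrelation(series, max_lag)
    bound = 1.96 / np.sqrt(len(series))
    return [k for k in range(1, max_lag + 1) if abs(acf[k]) > bound]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y = np.asarray(observed, dtype=float)
    y_hat = np.asarray(predicted, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

In a workflow of this kind, the lags returned by `significant_lags` would typically become the lagged rainfall columns fed to a regression model, and `r_squared` would compare its forecasts with the observed series.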