Multiple Linear Regression Models for Predicting Nonpoint-Source Pollutant Discharge from a Highland Agricultural Region
Sediment runoff from dense highland field areas greatly affects the quality of downstream lakes and drinking water sources. In this study, multiple linear regression (MLR) models were built to predict diffuse pollutant discharge using the environmental parameters of a basin. Explanatory variables th...
Main Authors: | Jae Heon Cho, Jong Ho Lee |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2018-08-01 |
Series: | Water |
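Subjects: | highland agricultural field area; diffuse pollutant discharge; multiple regression model; climate change; jackknife validation |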
Online Access: | http://www.mdpi.com/2073-4441/10/9/1156 |
id |
doaj-e44ae22ad55148cc802142ae2d9704ad |
---|---|
record_format |
Article |
spelling |
doaj-e44ae22ad55148cc802142ae2d9704ad; 2020-11-24T23:55:18Z; eng; MDPI AG; Water; 2073-4441; 2018-08-01; 10; 9; 1156; 10.3390/w10091156; w10091156; Multiple Linear Regression Models for Predicting Nonpoint-Source Pollutant Discharge from a Highland Agricultural Region; Jae Heon Cho [0]; Jong Ho Lee [1]; [0] Department of Biosystems and Convergence Engineering, Catholic Kwandong University, 24 Beomil-ro, 579 beon-gil, Gangneung-si, Gangwon-do 25601, Korea; [1] Department of Urban Planning and Real Estate, Cheongju University, 298 Daeseongro, Cheongwon-gu, Cheongju, Chungbuk 28503, Korea; Sediment runoff from dense highland field areas greatly affects the quality of downstream lakes and drinking water sources. In this study, multiple linear regression (MLR) models were built to predict diffuse pollutant discharge using the environmental parameters of a basin. Explanatory variables that influence the sediment and pollutant discharge can be identified with the model, and such research could play an important role in limiting sediment erosion in the dense highland field area. Pollutant load per event, event mean concentration (EMC), and pollutant load per area were estimated from stormwater survey data from the Lake Soyang basin. During the wet season, heavy rains cause large amounts of suspended sediment and the occurrence of such rains is increasing due to climate change. The explanatory variables used in the MLR models are the percentage of fields, subbasin area, and mean slope of subbasin as topographic parameters, and the number of preceding dry days, rainfall intensity, rainfall depth, and rainfall duration as rainfall parameters. In the MLR modeling process, four types of regression equations with and without log transformation of the explanatory and response variables were examined to identify the best performing regression model. The performance of the MLR models was evaluated using the coefficient of determination (R²), root mean square error (RMSE), coefficient of variation of the root mean square error (CV(RMSE)), the ratio of the RMSE to the standard deviation of the observed data (RSR) and the Nash–Sutcliffe model efficiency (NSE). The performance of the MLR models of pollutant load except total nitrogen (TN) was good under the condition of RSR, and satisfactory for the NSE and R². In the EMC and load/area models, the performance for suspended solids (SS) and total phosphorus (TP) was good for the RSR, and satisfactory for the NSE and R². The standardized coefficients for the models were analyzed to identify the influential explanatory variables in the models. In the final performance evaluation, the results of jackknife validation indicate that the MLR models are robust.; http://www.mdpi.com/2073-4441/10/9/1156; highland agricultural field area; diffuse pollutant discharge; multiple regression model; climate change; jackknife validation |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Jae Heon Cho; Jong Ho Lee |
author_facet |
Jae Heon Cho; Jong Ho Lee |
author_sort |
Jae Heon Cho |
title |
Multiple Linear Regression Models for Predicting Nonpoint-Source Pollutant Discharge from a Highland Agricultural Region |
title_sort |
multiple linear regression models for predicting nonpoint-source pollutant discharge from a highland agricultural region |
publisher |
MDPI AG |
series |
Water |
issn |
2073-4441 |
publishDate |
2018-08-01 |
description |
Sediment runoff from dense highland field areas greatly affects the quality of downstream lakes and drinking water sources. In this study, multiple linear regression (MLR) models were built to predict diffuse pollutant discharge using the environmental parameters of a basin. Explanatory variables that influence the sediment and pollutant discharge can be identified with the model, and such research could play an important role in limiting sediment erosion in the dense highland field area. Pollutant load per event, event mean concentration (EMC), and pollutant load per area were estimated from stormwater survey data from the Lake Soyang basin. During the wet season, heavy rains cause large amounts of suspended sediment and the occurrence of such rains is increasing due to climate change. The explanatory variables used in the MLR models are the percentage of fields, subbasin area, and mean slope of subbasin as topographic parameters, and the number of preceding dry days, rainfall intensity, rainfall depth, and rainfall duration as rainfall parameters. In the MLR modeling process, four types of regression equations with and without log transformation of the explanatory and response variables were examined to identify the best performing regression model. The performance of the MLR models was evaluated using the coefficient of determination (R²), root mean square error (RMSE), coefficient of variation of the root mean square error (CV(RMSE)), the ratio of the RMSE to the standard deviation of the observed data (RSR) and the Nash–Sutcliffe model efficiency (NSE). The performance of the MLR models of pollutant load except total nitrogen (TN) was good under the condition of RSR, and satisfactory for the NSE and R². In the EMC and load/area models, the performance for suspended solids (SS) and total phosphorus (TP) was good for the RSR, and satisfactory for the NSE and R². The standardized coefficients for the models were analyzed to identify the influential explanatory variables in the models. In the final performance evaluation, the results of jackknife validation indicate that the MLR models are robust. |
topic |
highland agricultural field area; diffuse pollutant discharge; multiple regression model; climate change; jackknife validation |
url |
http://www.mdpi.com/2073-4441/10/9/1156 |
_version_ |
1725463188361707520 |
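
The description above names five performance measures (R², RMSE, CV(RMSE), RSR, NSE) and a jackknife validation step without giving formulas. The following is a minimal sketch of how such an evaluation might be coded, not the authors' implementation. Several choices are assumptions rather than details from the record: R² is taken as the squared Pearson correlation, CV(RMSE) as RMSE divided by the observed mean, RSR uses the population standard deviation, the jackknife is read as leave-one-out refitting, and all data are synthetic. The function names are hypothetical.

```python
import numpy as np


def fit_mlr(X, y):
    """Fit y = b0 + X @ b by ordinary least squares; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to a 2-D array of explanatory variables."""
    return coef[0] + np.asarray(X, dtype=float) @ coef[1:]


def metrics(obs, sim):
    """Goodness-of-fit measures named in the abstract (definitions assumed)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    resid = obs - sim
    rmse = np.sqrt(np.mean(resid ** 2))
    # Population SD of the observations, so that RSR**2 == 1 - NSE.
    sd_obs = np.sqrt(np.mean((obs - obs.mean()) ** 2))
    r = np.corrcoef(obs, sim)[0, 1]
    return {
        "R2": r ** 2,
        "RMSE": rmse,
        "CV(RMSE)": rmse / obs.mean(),
        "RSR": rmse / sd_obs,
        "NSE": 1.0 - np.sum(resid ** 2) / np.sum((obs - obs.mean()) ** 2),
    }


def jackknife_predictions(X, y):
    """Leave-one-out: refit without sample i, then predict sample i."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        out[i] = predict(coef, X[i:i + 1])[0]
    return out


# Synthetic illustration only: a log-log model, one of the four
# transformation variants (none / response / explanatory / both).
rng = np.random.default_rng(0)
X_raw = rng.uniform(1.0, 10.0, size=(30, 3))   # stand-ins for basin/rainfall variables
log_X = np.log10(X_raw)
log_y = 0.5 + log_X @ np.array([1.2, 0.4, -0.3]) + rng.normal(0.0, 0.1, size=30)

coef = fit_mlr(log_X, log_y)
print("fit:      ", metrics(log_y, predict(coef, log_X)))
print("jackknife:", metrics(log_y, jackknife_predictions(log_X, log_y)))
```

Comparing the two printed dictionaries gives a rough sense of robustness: if the jackknife metrics stay close to the in-sample ones, no single observation dominates the fit.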