Multiple Linear Regression Models for Predicting Nonpoint-Source Pollutant Discharge from a Highland Agricultural Region

Sediment runoff from dense highland field areas greatly affects the quality of downstream lakes and drinking water sources. In this study, multiple linear regression (MLR) models were built to predict diffuse pollutant discharge using the environmental parameters of a basin. Explanatory variables th...

Bibliographic Details
Main Authors: Jae Heon Cho, Jong Ho Lee
Format: Article
Language: English
Published: MDPI AG 2018-08-01
Series: Water
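Subjects: highland agricultural field area; diffuse pollutant discharge; multiple regression model; climate change; jackknife validation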
Online Access: http://www.mdpi.com/2073-4441/10/9/1156
id doaj-e44ae22ad55148cc802142ae2d9704ad
record_format Article
spelling doaj-e44ae22ad55148cc802142ae2d9704ad
2020-11-24T23:55:18Z
eng
MDPI AG
Water
2073-4441
2018-08-01
10
9
1156
10.3390/w10091156
w10091156
Multiple Linear Regression Models for Predicting Nonpoint-Source Pollutant Discharge from a Highland Agricultural Region
Jae Heon Cho (0)
Jong Ho Lee (1)
(0) Department of Biosystems and Convergence Engineering, Catholic Kwandong University, 24 Beomil-ro, 579 beon-gil, Gangneung-si, Gangwon-do 25601, Korea
(1) Department of Urban Planning and Real Estate, Cheongju University, 298 Daeseongro, Cheongwon-gu, Cheongju, Chungbuk 28503, Korea
Sediment runoff from dense highland field areas greatly affects the quality of downstream lakes and drinking water sources. In this study, multiple linear regression (MLR) models were built to predict diffuse pollutant discharge using the environmental parameters of a basin. Explanatory variables that influence the sediment and pollutant discharge can be identified with the model, and such research could play an important role in limiting sediment erosion in the dense highland field area. Pollutant load per event, event mean concentration (EMC), and pollutant load per area were estimated from stormwater survey data from the Lake Soyang basin. During the wet season, heavy rains cause large amounts of suspended sediment and the occurrence of such rains is increasing due to climate change. The explanatory variables used in the MLR models are the percentage of fields, subbasin area, and mean slope of subbasin as topographic parameters, and the number of preceding dry days, rainfall intensity, rainfall depth, and rainfall duration as rainfall parameters. In the MLR modeling process, four types of regression equations with and without log transformation of the explanatory and response variables were examined to identify the best performing regression model. The performance of the MLR models was evaluated using the coefficient of determination (R²), root mean square error (RMSE), coefficient of variation of the root mean square error (CV(RMSE)), the ratio of the RMSE to the standard deviation of the observed data (RSR) and the Nash–Sutcliffe model efficiency (NSE). The performance of the MLR models of pollutant load except total nitrogen (TN) was good under the condition of RSR, and satisfactory for the NSE and R². In the EMC and load/area models, the performance for suspended solids (SS) and total phosphorus (TP) was good for the RSR, and satisfactory for the NSE and R². The standardized coefficients for the models were analyzed to identify the influential explanatory variables in the models. In the final performance evaluation, the results of jackknife validation indicate that the MLR models are robust.
http://www.mdpi.com/2073-4441/10/9/1156
highland agricultural field area
diffuse pollutant discharge
multiple regression model
climate change
jackknife validation
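The five performance measures named in the abstract can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function name `evaluate` is hypothetical, R² is assumed to be the squared Pearson correlation between observed and predicted values, CV(RMSE) is assumed to be RMSE divided by the observed mean, and RSR uses the population standard deviation of the observations so that RSR² = 1 − NSE.

```python
import numpy as np

def evaluate(observed, predicted):
    """Return R2, RMSE, CV(RMSE), RSR and NSE for paired values (hypothetical helper)."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    residual = obs - pred

    ss_res = np.sum(residual ** 2)              # sum of squared errors
    ss_tot = np.sum((obs - obs.mean()) ** 2)    # spread of the observations

    rmse = np.sqrt(ss_res / obs.size)
    # Assumption: CV(RMSE) normalises the RMSE by the mean of the observations.
    cv_rmse = rmse / obs.mean()
    # Assumption: RSR divides by the population standard deviation of the observations.
    rsr = rmse / np.sqrt(ss_tot / obs.size)
    # Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean.
    nse = 1.0 - ss_res / ss_tot
    # Assumption: R2 is the squared Pearson correlation of observed vs. predicted.
    r2 = np.corrcoef(obs, pred)[0, 1] ** 2

    return {"R2": r2, "RMSE": rmse, "CV(RMSE)": cv_rmse, "RSR": rsr, "NSE": nse}
```

Under these definitions RSR and NSE carry the same information, which is one reason a model can be rated differently on the two only when the rating thresholds differ.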
collection DOAJ
language English
format Article
sources DOAJ
author Jae Heon Cho
Jong Ho Lee
title Multiple Linear Regression Models for Predicting Nonpoint-Source Pollutant Discharge from a Highland Agricultural Region
publisher MDPI AG
series Water
issn 2073-4441
publishDate 2018-08-01
description Sediment runoff from dense highland field areas greatly affects the quality of downstream lakes and drinking water sources. In this study, multiple linear regression (MLR) models were built to predict diffuse pollutant discharge using the environmental parameters of a basin. Explanatory variables that influence the sediment and pollutant discharge can be identified with the model, and such research could play an important role in limiting sediment erosion in the dense highland field area. Pollutant load per event, event mean concentration (EMC), and pollutant load per area were estimated from stormwater survey data from the Lake Soyang basin. During the wet season, heavy rains cause large amounts of suspended sediment and the occurrence of such rains is increasing due to climate change. The explanatory variables used in the MLR models are the percentage of fields, subbasin area, and mean slope of subbasin as topographic parameters, and the number of preceding dry days, rainfall intensity, rainfall depth, and rainfall duration as rainfall parameters. In the MLR modeling process, four types of regression equations with and without log transformation of the explanatory and response variables were examined to identify the best performing regression model. The performance of the MLR models was evaluated using the coefficient of determination (R²), root mean square error (RMSE), coefficient of variation of the root mean square error (CV(RMSE)), the ratio of the RMSE to the standard deviation of the observed data (RSR) and the Nash–Sutcliffe model efficiency (NSE). The performance of the MLR models of pollutant load except total nitrogen (TN) was good under the condition of RSR, and satisfactory for the NSE and R². In the EMC and load/area models, the performance for suspended solids (SS) and total phosphorus (TP) was good for the RSR, and satisfactory for the NSE and R². The standardized coefficients for the models were analyzed to identify the influential explanatory variables in the models. In the final performance evaluation, the results of jackknife validation indicate that the MLR models are robust.
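The description mentions four regression forms (with and without log transformation of the explanatory and response variables) and a jackknife validation. A minimal sketch of how those two ideas might fit together follows. It rests on several assumptions not stated in the record: the four forms are taken to be the four combinations of log-transforming X and y, base-10 logs are used, the jackknife is implemented as leave-one-out refitting, and all names (`FORMS`, `fit_mlr`, `jackknife_by_form`) are hypothetical.

```python
import numpy as np

# Assumed reading of the "four types": (log-transform X?, log-transform y?).
FORMS = {
    "y ~ x": (False, False),
    "y ~ log x": (True, False),
    "log y ~ x": (False, True),
    "log y ~ log x": (True, True),
}

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Apply fitted coefficients to one sample (1-D) or many samples (2-D)."""
    return coef[0] + np.asarray(X) @ coef[1:]

def jackknife_predictions(X, y):
    """Leave-one-out: refit without sample i, then predict sample i."""
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef = fit_mlr(X[keep], y[keep])
        preds[i] = predict_mlr(coef, X[i])
    return preds

def jackknife_by_form(X, y):
    """Held-out predictions in original units for each assumed regression form."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    results = {}
    for name, (log_x, log_y) in FORMS.items():
        # Log forms require strictly positive values; zeros (for example,
        # zero preceding dry days) would need an offset chosen by the analyst.
        Xt = np.log10(X) if log_x else X
        yt = np.log10(y) if log_y else y
        pred = jackknife_predictions(Xt, yt)
        # Back-transform so every form is compared on the same scale.
        results[name] = 10.0 ** pred if log_y else pred
    return results
```

Comparing the held-out predictions of each form against the observations, for example with RMSE or NSE, gives one way to pick a best-performing form and to judge how sensitive the fit is to any single storm event.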
topic highland agricultural field area
diffuse pollutant discharge
multiple regression model
climate change
jackknife validation
url http://www.mdpi.com/2073-4441/10/9/1156