Predictive Soil Pollution Mapping: A Hybrid Approach for a Dataset With Outliers

Spatial regression or interpolation is widely used for predictive soil pollution mapping, which aims to estimate all unobserved soil pollution based on a finite number of sample points. However, it may be unreasonable to use spatial regression or interpolation directly for an environmental soil data...


Bibliographic Details
Main Authors: Wentao Yang, Min Deng, Xuexi Yang, Dongsheng Wei
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Land pollution; geographic information system; spatial database; data mining
Online Access: https://ieeexplore.ieee.org/document/8673734/
id doaj-844ba77a471d4c27a0b149acf640f911
record_format Article
spelling doaj-844ba77a471d4c27a0b149acf640f911; 2021-03-29T22:28:19Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2019-01-01; vol. 7, pp. 46668-46676; DOI 10.1109/ACCESS.2019.2907198; article 8673734
Predictive Soil Pollution Mapping: A Hybrid Approach for a Dataset With Outliers
Wentao Yang (National-Local Joint Engineering Laboratory of Geospatial Information Technology, Hunan University of Science and Technology, Xiangtan, China)
Min Deng, https://orcid.org/0000-0002-3035-9116 (School of Geosciences and Info-Physics, Central South University, Changsha, China)
Xuexi Yang (School of Geosciences and Info-Physics, Central South University, Changsha, China)
Dongsheng Wei (College of Civil Engineering, Central South University of Forestry and Technology, Changsha, China)
https://ieeexplore.ieee.org/document/8673734/
Land pollution; geographic information system; spatial database; data mining
collection DOAJ
language English
format Article
sources DOAJ
author Wentao Yang
Min Deng
Xuexi Yang
Dongsheng Wei
title Predictive Soil Pollution Mapping: A Hybrid Approach for a Dataset With Outliers
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2019-01-01
description Spatial regression or interpolation is widely used for predictive soil pollution mapping, which aims to estimate all unobserved soil pollution based on a finite number of sample points. However, it may be unreasonable to use spatial regression or interpolation directly for an environmental soil dataset with outliers, because the mechanism generating outlier datasets is always different from that generating normal datasets, which necessitates handling outliers separately. Therefore, a hybrid approach for estimating unknown soil pollution concentrations is developed in this paper. The hybrid approach comprises three main steps: First, spatial outlier detection is used to uncover abnormal sample points, and the study area is then divided into normal and outlier areas. Second, spatial regression and interpolation are applied to analyze the normal and outlier datasets, respectively. Finally, the predictive soil pollution map is derived by combining the predictions of spatial regression and interpolation. An environmental dataset recording heavy metal Cd and As concentrations in Huizhou, China, was selected to verify the performance of the proposed approach. Spatial outlier detection identified 16 outlier points for Cd and 13 for As. For Cd, the mean square error (MSE) and mean relative error (MRE) of the hybrid approach were about 0.028 and 0.332, respectively. For As, the MSE and MRE of the hybrid approach were about 3.834 and 0.366, respectively. All of these values were smaller than those of the models used for comparison. The comparative analysis demonstrates the feasibility and effectiveness of the proposed approach.
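The three steps named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not specify the outlier detector, the regression model, or the interpolator, so the sketch assumes a neighbour-median z-score test for outliers, a one-covariate least-squares line as a stand-in for spatial regression, and inverse distance weighting (IDW) for interpolation. The field names `xy`, `cov`, and `value`, the toy grid, and the MSE/MRE formulas (mean squared difference, mean absolute difference divided by the observed value) are likewise assumptions made for illustration.

```python
import math
from statistics import mean, median, pstdev

def dist(a, b):
    """Euclidean distance between two (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def detect_outliers(samples, k=3, threshold=2.0):
    """Step 1 (assumed detector): flag samples whose value deviates strongly
    from the median value of their k nearest neighbours."""
    diffs = []
    for i, s in enumerate(samples):
        others = [t for j, t in enumerate(samples) if j != i]
        nearest = sorted(others, key=lambda t: dist(s["xy"], t["xy"]))[:k]
        diffs.append(s["value"] - median(t["value"] for t in nearest))
    mu, sd = mean(diffs), pstdev(diffs)
    if sd == 0:
        return [False] * len(samples)
    return [abs(d - mu) / sd > threshold for d in diffs]

def fit_line(xs, ys):
    """Least-squares slope and intercept (stand-in for spatial regression)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def idw(xy, samples, power=2.0):
    """Inverse distance weighted estimate at xy from the given samples."""
    num = den = 0.0
    for s in samples:
        d = dist(xy, s["xy"])
        if d == 0.0:
            return s["value"]
        w = d ** -power
        num += w * s["value"]
        den += w
    return num / den

def hybrid_predict(target, samples, flags):
    """Steps 2 and 3: regression in the normal area, IDW in the outlier area.
    The area of a target is taken from its nearest sample (an assumption)."""
    normal = [s for s, f in zip(samples, flags) if not f]
    outliers = [s for s, f in zip(samples, flags) if f]
    nearest = min(range(len(samples)),
                  key=lambda i: dist(target["xy"], samples[i]["xy"]))
    if flags[nearest]:
        return idw(target["xy"], outliers)
    slope, intercept = fit_line([s["cov"] for s in normal],
                                [s["value"] for s in normal])
    return slope * target["cov"] + intercept

def mse(observed, predicted):
    """Mean square error."""
    return mean((o - p) ** 2 for o, p in zip(observed, predicted))

def mre(observed, predicted):
    """Mean relative error (observed values must be non-zero)."""
    return mean(abs(o - p) / abs(o) for o, p in zip(observed, predicted))

# Toy 3x3 grid: value = 1 + 2 * cov everywhere except an injected anomaly at (1, 1).
samples = [{"xy": (x, y), "cov": x + y, "value": 1 + 2 * (x + y)}
           for x in range(3) for y in range(3)]
samples[4]["value"] = 50
flags = detect_outliers(samples)
```

With these pieces, `hybrid_predict({"xy": (0.2, 0.1), "cov": 0.3}, samples, flags)` uses the regression line because the nearest sample is normal, while a target next to the flagged point at (1, 1) is interpolated from the outlier samples only. Routing by nearest sample is the simplest way to mimic a division into normal and outlier areas; a real study would need an explicit spatial partition.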
topic Land pollution
geographic information system
spatial database
data mining
url https://ieeexplore.ieee.org/document/8673734/