Predictive Soil Pollution Mapping: A Hybrid Approach for a Dataset With Outliers
Spatial regression or interpolation is widely used for predictive soil pollution mapping, which aims to estimate all unobserved soil pollution based on a finite number of sample points. However, it may be unreasonable to use spatial regression or interpolation directly for an environmental soil data...
Main Authors: | Wentao Yang, Min Deng, Xuexi Yang, Dongsheng Wei |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
Subjects: | Land pollution; geographic information system; spatial database; data mining |
Online Access: | https://ieeexplore.ieee.org/document/8673734/ |
id |
doaj-844ba77a471d4c27a0b149acf640f911 |
---|---|
record_format |
Article |
spelling |
doaj-844ba77a471d4c27a0b149acf640f911
2021-03-29T22:28:19Z
eng
IEEE
IEEE Access
ISSN: 2169-3536
2019-01-01
Volume: 7
Pages: 46668-46676
DOI: 10.1109/ACCESS.2019.2907198
IEEE Xplore document: 8673734
Predictive Soil Pollution Mapping: A Hybrid Approach for a Dataset With Outliers
Wentao Yang (0); Min Deng (1), https://orcid.org/0000-0002-3035-9116; Xuexi Yang (2); Dongsheng Wei (3)
(0) National-Local Joint Engineering Laboratory of Geospatial Information Technology, Hunan University of Science and Technology, Xiangtan, China
(1) School of Geosciences and Info-Physics, Central South University, Changsha, China
(2) School of Geosciences and Info-Physics, Central South University, Changsha, China
(3) College of Civil Engineering, Central South University of Forestry and Technology, Changsha, China
Spatial regression or interpolation is widely used for predictive soil pollution mapping, which aims to estimate all unobserved soil pollution based on a finite number of sample points. However, it may be unreasonable to use spatial regression or interpolation directly for an environmental soil dataset with outliers, because the mechanism generating outlier datasets is always different from that generating normal datasets, which necessitates handling outliers separately. Therefore, a hybrid approach for estimating unknown soil pollution concentrations is developed in this paper. The hybrid approach comprises three main steps: First, spatial outlier detection is used to uncover abnormal sample points and the study area is then divided into the normal and outlier areas. Second, spatial regression and interpolation are applied to analyze the normal and outlier datasets, respectively. Finally, the results of the predictive soil pollution mapping are derived from the prediction combination of spatial regression and interpolation. An environmental dataset recording heavy metal Cd and As concentrations at Huizhou, China was selected to verify the performance of the proposed approach. The numbers of identified outlier points of heavy metal Cd and As concentrations were 16 and 13. For the prediction result of Cd, the mean square error (MSE) and mean relative error (MRE) of the hybrid approach were about 0.028 and 0.332, respectively. For the prediction result of As, the MSE and MRE of the hybrid approach were about 3.834 and 0.366, respectively. All of these values were smaller than those of models used for comparison. The result of the comparative analysis demonstrates the feasibility and effectiveness of the proposed approach.
https://ieeexplore.ieee.org/document/8673734/
Land pollution; geographic information system; spatial database; data mining |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Wentao Yang, Min Deng, Xuexi Yang, Dongsheng Wei |
spellingShingle |
Wentao Yang Min Deng Xuexi Yang Dongsheng Wei Predictive Soil Pollution Mapping: A Hybrid Approach for a Dataset With Outliers IEEE Access Land pollution geographic information system spatial database data mining |
author_facet |
Wentao Yang, Min Deng, Xuexi Yang, Dongsheng Wei |
author_sort |
Wentao Yang |
title |
Predictive Soil Pollution Mapping: A Hybrid Approach for a Dataset With Outliers |
title_short |
Predictive Soil Pollution Mapping: A Hybrid Approach for a Dataset With Outliers |
title_full |
Predictive Soil Pollution Mapping: A Hybrid Approach for a Dataset With Outliers |
title_fullStr |
Predictive Soil Pollution Mapping: A Hybrid Approach for a Dataset With Outliers |
title_full_unstemmed |
Predictive Soil Pollution Mapping: A Hybrid Approach for a Dataset With Outliers |
title_sort |
predictive soil pollution mapping: a hybrid approach for a dataset with outliers |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2019-01-01 |
description |
Spatial regression or interpolation is widely used for predictive soil pollution mapping, which aims to estimate all unobserved soil pollution based on a finite number of sample points. However, it may be unreasonable to use spatial regression or interpolation directly for an environmental soil dataset with outliers, because the mechanism generating outlier datasets is always different from that generating normal datasets, which necessitates handling outliers separately. Therefore, a hybrid approach for estimating unknown soil pollution concentrations is developed in this paper. The hybrid approach comprises three main steps: First, spatial outlier detection is used to uncover abnormal sample points and the study area is then divided into the normal and outlier areas. Second, spatial regression and interpolation are applied to analyze the normal and outlier datasets, respectively. Finally, the results of the predictive soil pollution mapping are derived from the prediction combination of spatial regression and interpolation. An environmental dataset recording heavy metal Cd and As concentrations at Huizhou, China was selected to verify the performance of the proposed approach. The numbers of identified outlier points of heavy metal Cd and As concentrations were 16 and 13. For the prediction result of Cd, the mean square error (MSE) and mean relative error (MRE) of the hybrid approach were about 0.028 and 0.332, respectively. For the prediction result of As, the MSE and MRE of the hybrid approach were about 3.834 and 0.366, respectively. All of these values were smaller than those of models used for comparison. The result of the comparative analysis demonstrates the feasibility and effectiveness of the proposed approach. |
topic |
Land pollution; geographic information system; spatial database; data mining |
url |
https://ieeexplore.ieee.org/document/8673734/ |
work_keys_str_mv |
AT wentaoyang predictivesoilpollutionmappingahybridapproachforadatasetwithoutliers AT mindeng predictivesoilpollutionmappingahybridapproachforadatasetwithoutliers AT xuexiyang predictivesoilpollutionmappingahybridapproachforadatasetwithoutliers AT dongshengwei predictivesoilpollutionmappingahybridapproachforadatasetwithoutliers |
_version_ |
1724191524029202432 |
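
The three steps summarized in the description field (detect spatial outliers, model normal and outlier samples separately, then combine the predictions) and the two error measures it reports can be illustrated with a minimal NumPy sketch. The record does not give the paper's actual algorithms, so everything below is an assumption chosen for illustration: the neighbourhood-median outlier rule, the least-squares plane standing in for spatial regression, inverse-distance weighting as the interpolator, the nearest-sample rule for splitting the study area, the MRE definition, and all function and parameter names.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between every row of a and every row of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

def detect_spatial_outliers(xy, z, k=5, threshold=3.0):
    """Step 1 (assumed rule, not the paper's): flag samples whose value departs
    strongly from the median of their k nearest neighbours (robust z-score)."""
    order = np.argsort(pairwise_dist(xy, xy), axis=1)
    neighbours = order[:, 1:k + 1]  # column 0 is the point itself (distinct coordinates assumed)
    resid = z - np.median(z[neighbours], axis=1)
    dev = np.abs(resid - np.median(resid))
    mad = np.median(dev)
    return 0.6745 * dev / (mad + 1e-12) > threshold

def fit_trend(xy, z):
    """Step 2a (stand-in for spatial regression): least-squares plane z ~ 1 + x + y."""
    design = np.column_stack([np.ones(len(xy)), xy])
    coef, *_ = np.linalg.lstsq(design, z, rcond=None)
    return coef

def idw(xy, z, query, power=2.0):
    """Step 2b (assumed interpolator): inverse-distance weighting."""
    w = 1.0 / (pairwise_dist(query, xy) ** power + 1e-12)
    return (w @ z) / w.sum(axis=1)

def hybrid_predict(xy, z, query, k=5, threshold=3.0):
    """Step 3: combine both predictors. Each query location inherits the
    normal/outlier label of its nearest sample (an assumed area partition)."""
    is_outlier = detect_spatial_outliers(xy, z, k, threshold)
    coef = fit_trend(xy[~is_outlier], z[~is_outlier])
    pred = np.column_stack([np.ones(len(query)), query]) @ coef
    in_outlier_area = is_outlier[np.argmin(pairwise_dist(query, xy), axis=1)]
    if in_outlier_area.any():
        pred[in_outlier_area] = idw(xy[is_outlier], z[is_outlier], query[in_outlier_area])
    return pred

def mse(obs, pred):
    """Mean square error."""
    return float(np.mean((pred - obs) ** 2))

def mre(obs, pred):
    """Mean relative error, assumed here to be mean(|pred - obs| / |obs|);
    observations must be non-zero."""
    return float(np.mean(np.abs(pred - obs) / np.abs(obs)))

# Toy data (hypothetical): a 5 x 5 grid with a constant background and one hot spot.
xs, ys = np.meshgrid(np.arange(5.0), np.arange(5.0))
grid_xy = np.column_stack([xs.ravel(), ys.ravel()])
grid_z = np.full(25, 10.0)
grid_z[12] = 100.0  # sample at (2, 2)
print(hybrid_predict(grid_xy, grid_z, np.array([[0.1, 0.1], [2.1, 2.0]])))
```

In the toy data, a query near a background sample is predicted from the fitted plane, while a query nearest the hot spot is predicted from the outlier samples only, which is the separation the description argues for. A real application would replace each assumed component with the method actually used in the paper.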