Spatiotemporal continuous estimates of PM2.5 concentrations in China, 2000–2016: A machine learning method with inputs from satellites, chemical transport model, and ground observations

Ambient exposure to fine particulate matter (PM2.5) is known to harm public health in China. Satellite remote sensing measurements of aerosol optical depth (AOD) were statistically associated with in-situ observations after 2013 to predict PM2.5 concentrations nationwide, while the lack of surface m...

Bibliographic Details
Main Authors: Tao Xue, Yixuan Zheng, Dan Tong, Bo Zheng, Xin Li, Tong Zhu, Qiang Zhang
Format: Article
Language: English
Published: Elsevier 2019-02-01
Series: Environment International
Online Access: http://www.sciencedirect.com/science/article/pii/S0160412018316623
id doaj-94641564e67242db8a9245f403199a0a
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author Tao Xue
Yixuan Zheng
Dan Tong
Bo Zheng
Xin Li
Tong Zhu
Qiang Zhang
title Spatiotemporal continuous estimates of PM2.5 concentrations in China, 2000–2016: A machine learning method with inputs from satellites, chemical transport model, and ground observations
publisher Elsevier
series Environment International
issn 0160-4120
publishDate 2019-02-01
description Ambient exposure to fine particulate matter (PM2.5) is known to harm public health in China. Satellite remote sensing measurements of aerosol optical depth (AOD) were statistically associated with in-situ observations after 2013 to predict PM2.5 concentrations nationwide, but the lack of surface monitoring data before 2013 has created difficulties in historical PM2.5 exposure estimates. Hindcast approaches using statistical models or chemical transport models (CTMs) were developed to overcome this limitation, but those approaches still suffer from incomplete daily coverage due to missing AOD data or limited accuracy due to uncertainties of CTMs. Here we developed a new machine learning (ML) model with high-dimensional expansion (HD-expansion) of numerous predictors (including AOD and other satellite covariates, meteorological variables and CTM simulations). Through comprehensive characterization of the nonlinear effects of, and interactions among, different predictors, the HD-expansion parameterized the association between PM2.5 and AOD as a nonlinear function of space and time covariates (e.g., planetary boundary layer height and relative humidity). In this way, the PM2.5-AOD association can vary spatiotemporally. We trained the model with data from 2013 to 2016 and evaluated its performance using annually-iterated cross-validation, which iteratively held out the in-situ observations for a whole calendar year (as testing data) to examine the predictions from a model trained on the rest of the observations. Our estimates were found to be in good agreement with in-situ observations, with correlation coefficients (R²) of 0.61, 0.68, and 0.75 for daily, monthly and annual averages, respectively. To interpolate the missing predictions due to incomplete AOD data, we incorporated a generalized additive model into the ML model. The two-stage estimates of PM2.5 sacrificed prediction accuracy on a daily timescale (R² = 0.55), but achieved complete spatiotemporal coverage and improved the accuracy of monthly (R² = 0.71) and annual (R² = 0.77) averages. The model was then used to predict daily PM2.5 concentrations during 2000–2016 across China and estimate long-term trends in PM2.5 for the period. We found that population-weighted concentrations of PM2.5 significantly increased, by 2.10 (95% confidence interval (CI): 1.74, 2.46) μg/m³/year during 2000–2007, and rapidly decreased by 4.51 (3.12, 5.90) μg/m³/year during 2013–2016. In this study, we produced AOD-based estimates of historical PM2.5 with complete spatiotemporal coverage, which were shown to be accurate, particularly over the medium and long term. The products could support large-scale epidemiological studies and risk assessments of ambient PM2.5 in China and can be accessed via the website (http://www.meicmodel.org/dataset-phd.html).
Keywords: Fine particulate matter, Satellite remote sensing, Aerosol optical depth, Machine learning
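The annually-iterated cross-validation described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the function names (`annual_folds`, `annual_cv`, `fit_mean`, `predict_mean`) are hypothetical, and the toy mean-value model merely stands in for the ML model so that the fold logic can be run on its own.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple


def annual_folds(years: Sequence[int]) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (held_out_year, train_indices, test_indices), one fold per calendar year."""
    for held_out in sorted(set(years)):
        train = [i for i, yr in enumerate(years) if yr != held_out]
        test = [i for i, yr in enumerate(years) if yr == held_out]
        yield held_out, train, test


def annual_cv(
    years: Sequence[int],
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    fit: Callable,
    predict: Callable,
) -> Dict[int, Tuple[List[float], List[float]]]:
    """Hold out each calendar year in turn; return {year: (observed, predicted)}."""
    results = {}
    for held_out, train, test in annual_folds(years):
        # Train on every observation outside the held-out year.
        model = fit([X[i] for i in train], [y[i] for i in train])
        # Predict the held-out year's observations with that model.
        predicted = predict(model, [X[i] for i in test])
        results[held_out] = ([y[i] for i in test], predicted)
    return results


# Hypothetical stand-in model: predicts the training mean everywhere.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, X_test):
    return [model for _ in X_test]


if __name__ == "__main__":
    years = [2013, 2013, 2014, 2015, 2016, 2016]
    X = [[0.0]] * len(years)           # placeholder predictors
    y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # placeholder PM2.5 values
    for year, (obs, pred) in annual_cv(years, X, y, fit_mean, predict_mean).items():
        print(year, obs, pred)
```

Holding out a whole calendar year, rather than random observations, tests whether a model can predict a year it never saw, which is the situation faced when hindcasting to years without monitoring data.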
url http://www.sciencedirect.com/science/article/pii/S0160412018316623
spelling doaj-94641564e67242db8a9245f403199a0a 2020-11-25T02:51:58Z eng Elsevier Environment International 0160-4120 2019-02-01
Volume 123, pages 345–357
Author affiliations:
Tao Xue: BIC-ESAT and SKL-ESPC, College of Environmental Science and Engineering, Peking University, Beijing 100871, China; Department of Earth System Science, Tsinghua University, Beijing 100084, China
Yixuan Zheng: Department of Earth System Science, Tsinghua University, Beijing 100084, China
Dan Tong: Department of Earth System Science, Tsinghua University, Beijing 100084, China
Bo Zheng: State Key Joint Laboratory of Environment Simulation and Pollution Control, School of Environment, Tsinghua University, Beijing 100084, China
Xin Li: Department of Earth System Science, Tsinghua University, Beijing 100084, China
Tong Zhu: BIC-ESAT and SKL-ESPC, College of Environmental Science and Engineering, Peking University, Beijing 100871, China
Qiang Zhang: Department of Earth System Science, Tsinghua University, Beijing 100084, China; Corresponding author at: Mengminwei Science and Technology Building, Tsinghua University, Haidian, Beijing 100084, China.
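The population-weighted concentrations and per-year trends reported in the abstract can be illustrated with a small sketch. The record does not state how the trend was fitted, so the ordinary least-squares slope below is an assumption, and the function names and all numbers are hypothetical placeholders rather than values from the study.

```python
from typing import Sequence


def population_weighted_mean(
    concentrations: Sequence[float], populations: Sequence[float]
) -> float:
    """Average grid-cell concentrations, weighting each cell by its population."""
    total_population = sum(populations)
    if total_population <= 0:
        raise ValueError("total population must be positive")
    weighted_sum = sum(c * p for c, p in zip(concentrations, populations))
    return weighted_sum / total_population


def linear_trend(years: Sequence[float], values: Sequence[float]) -> float:
    """Ordinary least-squares slope of values against years (units per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(years, values))
    return sxy / sxx


if __name__ == "__main__":
    # Two hypothetical grid cells: the more populous cell dominates the average.
    print(population_weighted_mean([10.0, 30.0], [1.0, 3.0]))
    # Hypothetical annual series rising by a constant amount each year.
    print(linear_trend([2000, 2001, 2002], [50.0, 52.0, 54.0]))
```

Weighting by population makes the average reflect the concentrations people are actually exposed to rather than a simple area mean.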