Lag Variables in Nitrogen Oxide Concentration Modelling: A Case Study in Wrocław, Poland
Due to the unwavering interest of both residents and authorities in the air quality of urban agglomerations, we pose the following question in this paper: What impact do current and past meteorological factors and traffic flow intensity have on air quality? What is the impact of lagged variables on...
Main Authors: | Joanna A. Kamińska, Fernando Jiménez, Estrella Lucena-Sánchez, Guido Sciavicco, Tomasz Turek |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2020-11-01 |
Series: | Atmosphere |
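Subjects: | air pollution, nitrogen oxides, random forest, lag variables, multi-objective optimization, traffic flow |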
Online Access: | https://www.mdpi.com/2073-4433/11/12/1293 |
id |
doaj-24a9bf61b1bf48d58f9604a0f9f18597 |
---|---|
record_format |
Article |
spelling |
doaj-24a9bf61b1bf48d58f9604a0f9f18597; 2020-12-01T00:01:22Z; eng; MDPI AG; Atmosphere; 2073-4433; 2020-11-01; volume 11, issue 12, article 1293; 10.3390/atmos11121293
Lag Variables in Nitrogen Oxide Concentration Modelling: A Case Study in Wrocław, Poland
Joanna A. Kamińska (0), Fernando Jiménez (1), Estrella Lucena-Sánchez (2), Guido Sciavicco (3), Tomasz Turek (4)
(0) Department of Mathematics, Wroclaw University of Environmental and Life Sciences, 50-375 Wrocław, Poland
(1) Department of Information and Communication Engineering, University of Murcia, 30100 Murcia, Spain
(2) Department of Mathematics and Computer Science, University of Ferrara, 44121 Ferrara, Italy
(3) Department of Mathematics and Computer Science, University of Ferrara, 44121 Ferrara, Italy
(4) Department of Mathematics, Wroclaw University of Environmental and Life Sciences, 50-375 Wrocław, Poland
Due to the unwavering interest of both residents and authorities in the air quality of urban agglomerations, we pose the following question in this paper: What impact do current and past meteorological factors and traffic flow intensity have on air quality? What is the impact of lagged variables on the fit of an explanation model, and how do they affect its ability to predict? We focused on NO<sub>2</sub> and NO<sub>x</sub> concentrations, and conducted this research using hourly data from the city of Wrocław (western Poland) from 2015 to 2017; we used multi-objective optimization to determine the optimal delays. It turned out that for both NO<sub>2</sub> and NO<sub>x</sub>, the past values for traffic flow, wind speed, and sunshine duration are more important than the current ones. We built random forest models on each of the pollutants for both the current and past values and discovered that including a lagged variable increases the resulting R<sup>2</sup> from 0.51 to 0.56 for NO<sub>2</sub> and from 0.46 to 0.52 for NO<sub>x</sub>. We also analyzed the feature importance in each model, and found that for NO<sub>2</sub>, a wind speed delay of more than three hours causes a significant decrease, while the importance of relative humidity increases with a seven-hour delay; likewise, wind speed increases the importance for NO<sub>x</sub> prediction with a two-hour delay. We concluded that, in pollutant concentration modeling, the possibility of a delayed effect of the independent variables should always be considered, because it can significantly increase the performance of the model and suggest unexpected relationships or dependencies.
https://www.mdpi.com/2073-4433/11/12/1293
air pollution; nitrogen oxides; random forest; lag variables; multi-objective optimization; traffic flow |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Joanna A. Kamińska, Fernando Jiménez, Estrella Lucena-Sánchez, Guido Sciavicco, Tomasz Turek |
author_sort |
Joanna A. Kamińska |
title |
Lag Variables in Nitrogen Oxide Concentration Modelling: A Case Study in Wrocław, Poland |
title_sort |
lag variables in nitrogen oxide concentration modelling: a case study in wrocław, poland |
publisher |
MDPI AG |
series |
Atmosphere |
issn |
2073-4433 |
publishDate |
2020-11-01 |
description |
Due to the unwavering interest of both residents and authorities in the air quality of urban agglomerations, we pose the following question in this paper: What impact do current and past meteorological factors and traffic flow intensity have on air quality? What is the impact of lagged variables on the fit of an explanation model, and how do they affect its ability to predict? We focused on NO<sub>2</sub> and NO<sub>x</sub> concentrations, and conducted this research using hourly data from the city of Wrocław (western Poland) from 2015 to 2017; we used multi-objective optimization to determine the optimal delays. It turned out that for both NO<sub>2</sub> and NO<sub>x</sub>, the past values for traffic flow, wind speed, and sunshine duration are more important than the current ones. We built random forest models on each of the pollutants for both the current and past values and discovered that including a lagged variable increases the resulting R<sup>2</sup> from 0.51 to 0.56 for NO<sub>2</sub> and from 0.46 to 0.52 for NO<sub>x</sub>. We also analyzed the feature importance in each model, and found that for NO<sub>2</sub>, a wind speed delay of more than three hours causes a significant decrease, while the importance of relative humidity increases with a seven-hour delay; likewise, wind speed increases the importance for NO<sub>x</sub> prediction with a two-hour delay. We concluded that, in pollutant concentration modeling, the possibility of a delayed effect of the independent variables should always be considered, because it can significantly increase the performance of the model and suggest unexpected relationships or dependencies. |
topic |
air pollution; nitrogen oxides; random forest; lag variables; multi-objective optimization; traffic flow |
url |
https://www.mdpi.com/2073-4433/11/12/1293 |
_version_ |
1724411384033181696 |
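The description above reports that letting predictors enter the model with a delay improved R<sup>2</sup> for both pollutants. As a rough illustration of that idea only, the Python sketch below scans candidate delays for a single predictor and keeps the one with the best fit. It is not the authors' method: a one-variable least-squares line stands in for their random forest, an exhaustive scan stands in for their multi-objective optimization, and the data are synthetic rather than the Wrocław measurements. All names (`lagged_pairs`, `r_squared`, `TRUE_LAG`, `traffic`, `nox`) are hypothetical.

```python
import math
import random


def lagged_pairs(x, y, lag):
    """Pair y[t] with x[t - lag], dropping the first `lag` hours."""
    return x[:len(x) - lag], y[lag:]


def r_squared(x, y):
    """R^2 of a one-variable least-squares line fitted to (x, y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Synthetic hourly series (an assumption for illustration, not real data):
# the response follows the predictor with a built-in delay of TRUE_LAG hours.
TRUE_LAG = 3
rng = random.Random(0)
hours = 24 * 60
traffic = [
    100 + 50 * math.sin(2 * math.pi * t / 24) + rng.gauss(0, 20)
    for t in range(hours)
]
nox = [
    0.4 * traffic[max(t - TRUE_LAG, 0)] + rng.gauss(0, 5)
    for t in range(hours)
]

# Score every candidate delay from 0 to 7 hours and keep the best one.
scores = {
    lag: r_squared(*lagged_pairs(traffic, nox, lag))
    for lag in range(8)
}
best_lag = max(scores, key=scores.get)

print("R^2 with current values:", round(scores[0], 3))
print("best delay (hours):", best_lag, "R^2:", round(scores[best_lag], 3))
```

With several predictors, the same shifted columns could be passed to any regressor, so the comparison between a "current values only" model and a "with lags" model stays the same in structure.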