Data imputation in in situ-measured particle size distributions by means of neural networks
In air quality research, often only size-integrated particle mass concentrations are considered as indicators of aerosol particles. However, the mass concentrations do not provide sufficient information to convey the full story of fractionated size distr...
Main Authors: | P. L. Fung, M. A. Zaidan, O. Surakhi, S. Tarkoma, T. Petäjä, T. Hussein |
---|---|
Format: | Article |
Language: | English |
Published: | Copernicus Publications, 2021-08-01 |
Series: | Atmospheric Measurement Techniques |
Online Access: | https://amt.copernicus.org/articles/14/5535/2021/amt-14-5535-2021.pdf |
id |
doaj-286050e6405341048cd49d7ec0eda248 |
---|---|
record_format |
Article |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
P. L. Fung, M. A. Zaidan, O. Surakhi, S. Tarkoma, T. Petäjä, T. Hussein |
spellingShingle |
P. L. Fung, M. A. Zaidan, O. Surakhi, S. Tarkoma, T. Petäjä, T. Hussein; Data imputation in in situ-measured particle size distributions by means of neural networks; Atmospheric Measurement Techniques |
author_facet |
P. L. Fung, M. A. Zaidan, O. Surakhi, S. Tarkoma, T. Petäjä, T. Hussein |
author_sort |
P. L. Fung |
title |
Data imputation in in situ-measured particle size distributions by means of neural networks |
publisher |
Copernicus Publications |
series |
Atmospheric Measurement Techniques |
issn |
1867-1381 1867-8548 |
publishDate |
2021-08-01 |
description |
In air quality research, often only size-integrated particle mass
concentrations are considered as indicators of aerosol particles. However,
the mass concentrations do not provide sufficient information to convey the
full story of the fractionated size distribution, in which particles of
different diameters (D_p) deposit differently in the respiratory system and
cause various kinds of harm. Aerosol size distribution measurements rely on a
variety of techniques to classify the aerosol size and measure the size
distribution. From the raw data, the ambient size distribution is determined
using a suite of inversion algorithms. However, the inversion problem is
quite often ill-posed and challenging to solve. Because of instrumental
insufficiency and inversion limitations, imputation methods for fractionated
particle size distributions are of great significance for filling missing
gaps or replacing negative values. The study at hand involves a merged
particle size distribution from a scanning mobility particle sizer (NanoSMPS)
and an optical particle sizer (OPS), covering aerosol size distributions from
0.01 to 0.42 µm (electrical mobility equivalent size) and 0.3 to 10 µm
(optical equivalent size), respectively, together with meteorological
parameters collected at an urban background region in Amman, Jordan, in the
period of 1 August 2016–31 July 2017. We develop and evaluate feed-forward
neural network (FFNN) approaches to estimate number concentrations at a
particular size bin with (1) meteorological parameters, (2) number
concentrations at other size bins and (3) both of the above as input
variables. Two layers with 10–15 neurons are found to be the optimal option.
Worse performance is observed at the lower edge (0.01 < D_p < 0.02 µm), the
mid-range region (0.15 < D_p < 0.5 µm) and the upper edge (6 < D_p < 10 µm).
For the edges at both ends, the number of neighbouring size bins is limited,
and the detection efficiency of the corresponding instruments is lower than
for the other size bins. A distinct performance drop over the overlapping
mid-range region is due to the deficiency of a merging algorithm. Another
plausible reason for the poorer performance for finer particles is that they
are removed from the atmosphere more effectively than coarser particles, so
that the relationships between the input variables and the small particles
are more dynamic. An observable overestimation is also found in the early
morning for ultrafine particles, followed by a distinct underestimation
before midday. In the winter, due to possible sensor drift and interference
artefacts, the estimation performance is not as good as in the other seasons.
The FFNN approach based on meteorological parameters performs worse with
5 min data (R² = 0.22–0.58) than with data of longer time resolution
(R² = 0.66–0.77). The FFNN approach using the number concentrations at the
other size bins can serve as an alternative way to replace negative numbers
in the raw size distribution dataset thanks to its high accuracy and
reliability (R² = 0.97–1). This negative-number filling approach can maintain
a symmetric distribution of errors and complement the existing ill-posed
built-in algorithm in particle sizer instruments. |
url |
https://amt.copernicus.org/articles/14/5535/2021/amt-14-5535-2021.pdf |
spelling |
doaj-286050e6405341048cd49d7ec0eda248; 2021-08-13T12:15:16Z; eng; Copernicus Publications; Atmospheric Measurement Techniques; ISSN 1867-1381, 1867-8548; 2021-08-01; volume 14, pages 5535–5554; doi:10.5194/amt-14-5535-2021
Data imputation in in situ-measured particle size distributions by means of neural networks
P. L. Fung: Institute for Atmospheric and Earth System Research/Physics, Faculty of Science, University of Helsinki, 00140 Helsinki, Finland; Helsinki Institute of Sustainability Science, Faculty of Science, University of Helsinki, 00140 Helsinki, Finland
M. A. Zaidan: Institute for Atmospheric and Earth System Research/Physics, Faculty of Science, University of Helsinki, 00140 Helsinki, Finland; Helsinki Institute of Sustainability Science, Faculty of Science, University of Helsinki, 00140 Helsinki, Finland; Joint International Research Laboratory of Atmospheric and Earth System Sciences, School of Atmospheric Sciences, Nanjing University, Nanjing 210023, China
O. Surakhi: Department of Computer Science, The University of Jordan, Amman 11942, Jordan
S. Tarkoma: Department of Computer Science, Faculty of Science, University of Helsinki, 00140 Helsinki, Finland
T. Petäjä: Institute for Atmospheric and Earth System Research/Physics, Faculty of Science, University of Helsinki, 00140 Helsinki, Finland; Joint International Research Laboratory of Atmospheric and Earth System Sciences, School of Atmospheric Sciences, Nanjing University, Nanjing 210023, China
T. Hussein: Institute for Atmospheric and Earth System Research/Physics, Faculty of Science, University of Helsinki, 00140 Helsinki, Finland; Department of Physics, The University of Jordan, Amman 11942, Jordan
https://amt.copernicus.org/articles/14/5535/2021/amt-14-5535-2021.pdf |
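
The negative-value filling idea summarised in the abstract (estimating the number concentration in one size bin from the other size bins with a two-hidden-layer feed-forward network) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article: the data are synthetic, the network is hand-written in NumPy with tanh activations and plain gradient descent, the inputs are log10-transformed, and 12 hidden neurons are used simply because that lies within the 10–15 range the abstract mentions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the Amman measurements): 400 time steps x 5 size
# bins, strongly correlated in log10 space so neighbouring bins are informative.
n_rows, n_bins, target = 400, 5, 2
common = rng.normal(size=(n_rows, 1))
conc = 10.0 ** (3.0 + 0.5 * (common + 0.1 * rng.normal(size=(n_rows, n_bins))))

# Emulate an inversion artefact: 40 rows get a negative value in the target bin.
neg = np.zeros(n_rows, dtype=bool)
neg[rng.choice(n_rows, size=40, replace=False)] = True
truth = conc[neg, target].copy()          # kept only to score the sketch
conc[neg, target] = -1.0

# Inputs are the other bins; train only on rows whose target bin is valid.
others = np.delete(np.arange(n_bins), target)
is_bad = conc[:, target] <= 0
x_all = np.log10(conc[:, others])
x_good, y_good = x_all[~is_bad], np.log10(conc[~is_bad, target])
x_mu, x_sd = x_good.mean(axis=0), x_good.std(axis=0)
y_mu, y_sd = y_good.mean(), y_good.std()


def init(n_in, n_hidden):
    """Two hidden layers of equal width plus a linear output unit."""
    sizes = [n_in, n_hidden, n_hidden, 1]
    return [(rng.normal(size=(a, b)) / np.sqrt(a), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]


def forward(params, x):
    """Return the activations of every layer (input first, output last)."""
    acts = [x]
    for i, (w, b) in enumerate(params):
        z = acts[-1] @ w + b
        acts.append(z if i == len(params) - 1 else np.tanh(z))
    return acts


def train(params, x, y, lr=0.05, steps=2000):
    """Full-batch gradient descent on half the mean squared error."""
    for _ in range(steps):
        acts = forward(params, x)
        delta = (acts[-1] - y) / len(x)
        for i in reversed(range(len(params))):
            w, b = params[i]
            grad_w, grad_b = acts[i].T @ delta, delta.sum(axis=0)
            if i > 0:  # back-propagate through the tanh of the layer below
                delta = (delta @ w.T) * (1.0 - acts[i] ** 2)
            params[i] = (w - lr * grad_w, b - lr * grad_b)
    return params


params = train(init(len(others), 12),
               (x_good - x_mu) / x_sd,
               ((y_good - y_mu) / y_sd)[:, None])

# Predict the target bin for the flagged rows and write the values back.
x_bad = (x_all[is_bad] - x_mu) / x_sd
pred_log = forward(params, x_bad)[-1][:, 0] * y_sd + y_mu
filled = conc.copy()
filled[is_bad, target] = 10.0 ** pred_log  # back-transform, always positive

# Score against the values that were hidden, in log10 space.
log_truth = np.log10(truth)
r2 = 1.0 - np.sum((log_truth - pred_log) ** 2) / np.sum((log_truth - log_truth.mean()) ** 2)
print(f"R^2 on the hidden values: {r2:.3f}")
```

Working in log10 space is a design choice of this sketch: it keeps the back-transformed estimates strictly positive, which is the property a negative-number filling step needs. The R² printed here describes only the synthetic data and says nothing about the values reported in the abstract.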