Identification of new particle formation events with deep learning
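Main Authors: | J. Joutsensaari, M. Ozon, T. Nieminen, S. Mikkonen, T. Lähivaara, S. Decesari, M. C. Facchini, A. Laaksonen, K. E. J. Lehtinen |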
---|---|
Format: | Article |
Language: | English |
Published: | Copernicus Publications, 2018-07-01 |
Series: | Atmospheric Chemistry and Physics |
Online Access: | https://www.atmos-chem-phys.net/18/9597/2018/acp-18-9597-2018.pdf |
id |
doaj-5ad20a0b2a0d46ddb7895dd310eab6dc |
---|---|
record_format |
Article |
collection |
DOAJ |
sources |
DOAJ |
author |
J. Joutsensaari, M. Ozon, T. Nieminen, S. Mikkonen, T. Lähivaara, S. Decesari, M. C. Facchini, A. Laaksonen, K. E. J. Lehtinen |
title |
Identification of new particle formation events with deep learning |
issn |
1680-7316 1680-7324 |
description |
<p>New particle formation (NPF) in the atmosphere is globally an
important source of climate-relevant aerosol particles. The occurrence of NPF
events is typically analyzed manually by researchers from particle size
distribution data, day by day, which is time-consuming, and the classification
of event types may be inconsistent. To get more reliable and consistent
results, the NPF event analysis should be automated. We have developed an
automatic analysis method based on deep learning, a subarea of machine
learning, for NPF event identification. To our knowledge, this is the first
time that a deep learning method, i.e., transfer learning of a convolutional
neural network (CNN), has successfully been used to automatically classify
NPF events into different classes directly from particle size distribution
images, similarly to how the researchers carry out the manual classification. The
developed method is based on image analysis of particle size distributions
using a pretrained deep CNN, named AlexNet, which was transfer learned to
recognize NPF event classes (six different types). In transfer learning, a
partial set of particle size distribution images was used in the training
stage of the CNN and the rest of the images for testing the success of the
training. The method was applied to a 15-year-long dataset measured at San
Pietro Capofiume (SPC) in Italy. We studied the performance of the training
with different ratios of training to testing images as well as with
different regions of interest in the images. The results show that clear
event (i.e., classes 1 and 2) and nonevent days can be identified with an
accuracy of ca. 80 %, when the CNN classification is compared with that
of an expert, which is a good first result for automatic NPF event analysis.
In the event classification, the choice between different event classes is
not an easy task even for trained researchers, and thus overlap or confusion
between different classes occurs. Hence, we cross-validated the learning
results of the CNN against the expert-made classification. The results show
that the overlap typically occurs between adjacent or similar classes,
e.g., a manually classified Class 1 is categorized mainly into classes 1 and
2 by the CNN, indicating that the manual and CNN classifications are very
consistent for most of the days. The classification would be more consistent,
for both human and CNN, if only two classes were used for event days
instead of three. Thus, we recommend that in future analyses,
event days should be categorized into classes of <q>quantifiable</q> (i.e., clear
events, classes 1 and 2) and <q>nonquantifiable</q> (i.e., weak events, Class
3). This would better describe the difference of those classes: both
formation and growth rates can be determined for quantifiable days but not
both for nonquantifiable days. Furthermore, we investigated more deeply the
days that are classified as clear events by experts and recognized as
nonevents by the CNN and vice versa. Clear misclassifications seem to occur
more commonly in manual analysis than in the CNN categorization, which is
mostly due to the inconsistency in the human-made classification or errors in
the recording of the event class. In general, the automatic CNN classifier has
better reliability and repeatability in NPF event classification than
human-made classification and, thus, transfer-learned pretrained CNNs
are powerful tools for analyzing long-term datasets. The developed NPF event
classifier can be easily used to analyze any long-term dataset more
accurately and consistently, which helps us to understand in detail
aerosol–climate interactions and the long-term effects of climate change on
NPF in the atmosphere. We encourage researchers to use the model at other
sites. However, we suggest that the CNN should be transfer learned again for
new site data with a minimum of ca. 150 figures per class to obtain good
enough classification results, especially if the size distribution evolution
differs from the training data. In the future, we will apply the method to
data from other sites, develop it to analyze more parameters and evaluate how
successfully the CNN could be trained with synthetic NPF event data.</p> |
spelling |
doaj-5ad20a0b2a0d46ddb7895dd310eab6dc; 2020-11-24T21:16:10Z; eng; Copernicus Publications; Atmospheric Chemistry and Physics; ISSN 1680-7316, 1680-7324; 2018-07-01; vol. 18, pp. 9597-9615; doi:10.5194/acp-18-9597-2018
Identification of new particle formation events with deep learning
J. Joutsensaari (Department of Applied Physics, University of Eastern Finland, P.O. Box 1627, 70211 Kuopio, Finland)
M. Ozon (Department of Applied Physics, University of Eastern Finland, P.O. Box 1627, 70211 Kuopio, Finland)
T. Nieminen (Department of Applied Physics, University of Eastern Finland, P.O. Box 1627, 70211 Kuopio, Finland)
S. Mikkonen (Department of Applied Physics, University of Eastern Finland, P.O. Box 1627, 70211 Kuopio, Finland)
T. Lähivaara (Department of Applied Physics, University of Eastern Finland, P.O. Box 1627, 70211 Kuopio, Finland)
S. Decesari (Institute of Atmospheric Sciences and Climate of the Italian National Research Council, Bologna, Italy)
M. C. Facchini (Institute of Atmospheric Sciences and Climate of the Italian National Research Council, Bologna, Italy)
A. Laaksonen (Department of Applied Physics, University of Eastern Finland, P.O. Box 1627, 70211 Kuopio, Finland; Climate research Unit, Finnish Meteorological Institute, Helsinki, Finland)
K. E. J. Lehtinen (Department of Applied Physics, University of Eastern Finland, P.O. Box 1627, 70211 Kuopio, Finland; Atmospheric Research Centre of Eastern Finland, Finnish Meteorological Institute, Kuopio, Finland)
https://www.atmos-chem-phys.net/18/9597/2018/acp-18-9597-2018.pdf |
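
The abstract recommends merging event classes 1 and 2 into a "quantifiable" class and treating Class 3 as "nonquantifiable", and it compares CNN labels with expert labels day by day. The following is a minimal sketch of that comparison, not the authors' implementation: the label names (`class1`, `nonevent`, and so on), the helper functions, and the six example days are all hypothetical and made up for illustration. The abstract mentions six classes but names only classes 1 to 3 and nonevent days, so only those are modelled here.

```python
from collections import Counter

# Hypothetical label names (an assumption, not taken from the record).
# Classes 1 and 2 are merged into "quantifiable"; Class 3 becomes
# "nonquantifiable", following the recommendation in the abstract.
MERGE = {
    "class1": "quantifiable",
    "class2": "quantifiable",
    "class3": "nonquantifiable",
    "nonevent": "nonevent",
}


def merge_labels(labels):
    """Map fine-grained day labels onto the merged class scheme."""
    return [MERGE[label] for label in labels]


def confusion_counts(expert, cnn):
    """Count (expert label, CNN label) pairs, one pair per day."""
    if len(expert) != len(cnn):
        raise ValueError("expert and CNN label lists must have equal length")
    return Counter(zip(expert, cnn))


def agreement(expert, cnn):
    """Fraction of days on which the expert and the CNN give the same label."""
    counts = confusion_counts(expert, cnn)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no days to compare")
    matches = sum(n for (e, c), n in counts.items() if e == c)
    return matches / total


# Made-up example days, for illustration only (not data from the study).
expert_days = ["class1", "class2", "class1", "class3", "nonevent", "class2"]
cnn_days = ["class2", "class2", "class1", "class3", "nonevent", "class1"]

fine_agreement = agreement(expert_days, cnn_days)
coarse_agreement = agreement(merge_labels(expert_days), merge_labels(cnn_days))
print(f"fine-grained agreement: {fine_agreement:.2f}")
print(f"merged-class agreement: {coarse_agreement:.2f}")
```

In this toy example the only disagreements are between classes 1 and 2, so merging them removes every mismatch. That mirrors the abstract's observation that confusion occurs mainly between adjacent classes, but the numbers themselves carry no scientific meaning.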