Evaluation of hierarchical agglomerative cluster analysis methods for discrimination of primary biological aerosol
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Copernicus Publications 2015-11-01 |
Series: | Atmospheric Measurement Techniques |
Online Access: | http://www.atmos-meas-tech.net/8/4979/2015/amt-8-4979-2015.pdf |
Summary: | In this paper we present improved methods for discriminating and
quantifying primary biological aerosol particles (PBAPs) by applying
hierarchical agglomerative cluster analysis to multi-parameter
ultraviolet-light-induced fluorescence (UV-LIF) spectrometer data. The
methods employed in this study can be applied to data sets in excess
of 1 × 10<sup>6</sup> points on a desktop computer, allowing for each
fluorescent particle in a data set to be explicitly clustered. This
reduces the potential for misattribution found in subsampling and
comparative attribution methods used in previous approaches,
improving our capacity to discriminate and quantify PBAP
meta-classes. We evaluate the performance of several hierarchical
agglomerative cluster analysis linkages and data normalisation
methods using laboratory samples of known particle types and an
ambient data set.
<br><br>
Fluorescent and non-fluorescent polystyrene latex spheres were
sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4),
and the optical size, asymmetry factor and fluorescence
measurements were used as inputs to the analysis package. It was
found that the Ward linkage with <i>z</i>-score or range normalisation
performed best, correctly attributing 98 and 98.1 % of the
data points respectively. The best-performing methods were applied
to the BEACHON-RoMBAS (Bio–hydro–atmosphere interactions of Energy, Aerosols,
Carbon, H<sub>2</sub>O, Organics and Nitrogen–Rocky Mountain Biogenic Aerosol Study) ambient data set, where it was found that the
<i>z</i>-score and range normalisation methods yield similar results, with
each method producing clusters representative of fungal spores and
bacterial aerosol, consistent with previous results. The <i>z</i>-score
result was compared to clusters generated with previous approaches
(WIBS AnalysiS Program, WASP), where we observe that the subsampling
and comparative attribution method employed by WASP results in the
overestimation of the fungal spore concentration by a factor of 1.5
and the underestimation of bacterial aerosol concentration by
a factor of 5. We suggest that this is likely due to errors arising
from misattribution caused by poor centroid definition and from failure to
assign particles to a cluster, both consequences of the subsampling and
comparative attribution method employed by WASP. The methods used
here allow for the entire fluorescent population of particles to be
analysed, yielding an explicit cluster attribution for each particle and
improving cluster centroid definition and our capacity to
discriminate and quantify PBAP meta-classes compared to previous approaches. |
---|---|
ISSN: | 1867-1381 1867-8548 |
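
The workflow described in the summary (normalise the measurements, cluster with Ward linkage, then check how many points land with their own particle type) can be illustrated with a minimal sketch. This is not the authors' analysis package, which the record does not name. It assumes SciPy's `scipy.cluster.hierarchy` routines, and the data are synthetic: two hypothetical particle types with five made-up input columns standing in for size, asymmetry factor and fluorescence channels. The helper names (`zscore_normalise`, `range_normalise`, `ward_clusters`, `attribution_accuracy`) are introduced here for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def zscore_normalise(data):
    """Scale each column to zero mean and unit standard deviation."""
    return (data - data.mean(axis=0)) / data.std(axis=0)


def range_normalise(data):
    """Scale each column linearly onto the interval [0, 1]."""
    low = data.min(axis=0)
    return (data - low) / (data.max(axis=0) - low)


def ward_clusters(data, n_clusters):
    """Build a Ward-linkage tree and cut it into n_clusters flat clusters."""
    tree = linkage(data, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def attribution_accuracy(labels, truth):
    """Fraction of points that share a cluster with the majority of their own type."""
    correct = 0
    for cluster_id in np.unique(labels):
        members = truth[labels == cluster_id]
        correct += np.bincount(members).max()
    return correct / len(truth)


# Synthetic stand-in data (assumption): two particle types, five input columns.
rng = np.random.default_rng(0)
centres = np.array([
    [1.0, 5.0, 10.0, 50.0, 5.0],
    [3.0, 20.0, 400.0, 60.0, 300.0],
])
truth = np.repeat([0, 1], 200)
noise = rng.normal(scale=[0.1, 2.0, 5.0, 5.0, 5.0], size=(400, 5))
features = centres[truth] + noise

# Compare the two normalisation methods named in the summary.
results = {}
for name, normalise in [("z-score", zscore_normalise), ("range", range_normalise)]:
    labels = ward_clusters(normalise(features), n_clusters=2)
    results[name] = attribution_accuracy(labels, truth)
    print(name, results[name])
```

Note that `linkage` called on raw observations stores all pairwise distances, so this sketch is only practical for a few thousand points; clustering data sets of the size mentioned in the summary would need a memory-efficient Ward implementation.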