The New York City COVID-19 Spread in the 2020 Spring: A Study on the Potential Role of Particulate Using Time Series Analysis and Machine Learning

This study investigates the potential association between the daily distribution of the PM<sub>2.5</sub> air pollutant and the initial spread of COVID-19 in New York City. We study the period from 4 March to 22 March 2020, and apply our analysis to all five counties, including the cit...


Bibliographic Details
Main Authors: Silvia Mirri, Marco Roccetti, Giovanni Delnevo
Format: Article
Language: English
Published: MDPI AG 2021-01-01
Series: Applied Sciences
Subjects:
Online Access: https://www.mdpi.com/2076-3417/11/3/1177
id doaj-2d6355363b40448fade2cd28ec4d669c
record_format Article
spelling doaj-2d6355363b40448fade2cd28ec4d669c
2021-01-28T00:05:47Z
eng
MDPI AG
Applied Sciences
2076-3417
2021-01-01
1111771177
10.3390/app11031177
The New York City COVID-19 Spread in the 2020 Spring: A Study on the Potential Role of Particulate Using Time Series Analysis and Machine Learning
Silvia Mirri
Marco Roccetti
Giovanni Delnevo
Department of Computer Science and Engineering, University of Bologna, 40127 Bologna, Italy
This study investigates the potential association between the daily distribution of the PM<sub>2.5</sub> air pollutant and the initial spread of COVID-19 in New York City. We study the period from 4 March to 22 March 2020 and apply our analysis to the five counties that make up the city plus seven neighboring counties, covering both urban and peripheral districts. Using the <i>Granger</i> causality methodology, and considering the maximum lag (14 days) between infection and the corresponding diagnosis, we found that the time series of new daily infections registered in those 12 counties appear to correlate with the time series of PM<sub>2.5</sub> particulate concentrations in the air: 33 out of 36 statistical tests yielded a <i>p</i>-value below 0.005, thus confirming this hypothesis. Moreover, looking for further confirmation of this association, we trained four different machine learning algorithms on a portion of those time series. These were able to predict whether the number of new daily infections would surpass a given threshold over the remaining portion of the series, with an average accuracy ranging from 84% to 95%, depending on the algorithm and/or the specific county under observation. This is similar to results obtained for several polluted urban areas, e.g., Wuhan, Xiaogan, and Huanggang in China, and Northern Italy. Our study provides further evidence that ambient air pollutants can be associated with daily COVID-19 infection incidence.
https://www.mdpi.com/2076-3417/11/3/1177
COVID-19; New York city; time series analysis; daily infections; air pollution; machine learning
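The abstract refers to Granger causality tests between PM<sub>2.5</sub> concentrations and new daily infections. This record does not say how the authors implemented those tests, so the following is only a minimal sketch of the standard idea: regress the effect series on its own lags, then add lags of the candidate cause and compare the residual sums of squares with an F-statistic. The function names, the synthetic series, and the lag of 3 are assumptions made for illustration; none of them come from the paper.

```python
import numpy as np


def lagged(series, lags, n_rows):
    """Columns series[t-1], ..., series[t-lags] for t = lags .. lags+n_rows-1."""
    return np.column_stack(
        [series[lags - k : lags - k + n_rows] for k in range(1, lags + 1)]
    )


def granger_f_stat(cause, effect, lags):
    """F-statistic for 'lags of cause add nothing beyond lags of effect'."""
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    n_rows = len(effect) - lags
    y = effect[lags:]

    # Restricted model: intercept plus the effect's own lags.
    restricted = np.hstack([np.ones((n_rows, 1)), lagged(effect, lags, n_rows)])
    # Unrestricted model: additionally includes lags of the candidate cause.
    unrestricted = np.hstack([restricted, lagged(cause, lags, n_rows)])

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        return float(resid @ resid)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    df_num = lags
    df_den = n_rows - unrestricted.shape[1]
    f_stat = ((rss_r - rss_u) / df_num) / (rss_u / df_den)
    return f_stat, df_num, df_den


# Synthetic data (an assumption, not the study's data): "cases" follow
# "pm25" with a two-day delay plus a little noise.
rng = np.random.default_rng(0)
n = 200
pm25 = rng.normal(size=n)
cases = np.zeros(n)
for t in range(2, n):
    cases[t] = 0.8 * pm25[t - 2] + 0.1 * rng.normal()

f_forward, df1, df2 = granger_f_stat(pm25, cases, lags=3)
f_reverse, _, _ = granger_f_stat(cases, pm25, lags=3)
print(f"pm25 -> cases: F({df1}, {df2}) = {f_forward:.1f}")
print(f"cases -> pm25: F({df1}, {df2}) = {f_reverse:.2f}")
```

A p-value follows from the upper tail of the F distribution with the returned degrees of freedom, for example via `scipy.stats.f.sf`. In practice a maintained routine such as `grangercausalitytests` in statsmodels is probably preferable to hand-rolled code.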
collection DOAJ
language English
format Article
sources DOAJ
author Silvia Mirri
Marco Roccetti
Giovanni Delnevo
title The New York City COVID-19 Spread in the 2020 Spring: A Study on the Potential Role of Particulate Using Time Series Analysis and Machine Learning
publisher MDPI AG
series Applied Sciences
issn 2076-3417
publishDate 2021-01-01
description This study investigates the potential association between the daily distribution of the PM<sub>2.5</sub> air pollutant and the initial spread of COVID-19 in New York City. We study the period from 4 March to 22 March 2020 and apply our analysis to the five counties that make up the city plus seven neighboring counties, covering both urban and peripheral districts. Using the <i>Granger</i> causality methodology, and considering the maximum lag (14 days) between infection and the corresponding diagnosis, we found that the time series of new daily infections registered in those 12 counties appear to correlate with the time series of PM<sub>2.5</sub> particulate concentrations in the air: 33 out of 36 statistical tests yielded a <i>p</i>-value below 0.005, thus confirming this hypothesis. Moreover, looking for further confirmation of this association, we trained four different machine learning algorithms on a portion of those time series. These were able to predict whether the number of new daily infections would surpass a given threshold over the remaining portion of the series, with an average accuracy ranging from 84% to 95%, depending on the algorithm and/or the specific county under observation. This is similar to results obtained for several polluted urban areas, e.g., Wuhan, Xiaogan, and Huanggang in China, and Northern Italy. Our study provides further evidence that ambient air pollutants can be associated with daily COVID-19 infection incidence.
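The abstract also describes training classifiers on one portion of the time series and predicting, for the remaining portion, whether new daily infections exceed a threshold. The four algorithms are not named in this record, so the sketch below substitutes a plain logistic regression fitted by gradient descent. The window length, the 70/30 chronological split, the median-based threshold, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def make_windows(pm25, cases, window, threshold):
    """Features: the previous `window` PM2.5 values; label: cases above threshold."""
    X = np.array([pm25[t - window : t] for t in range(window, len(pm25))])
    y = (cases[window:] > threshold).astype(float)
    return X, y


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, steps=2000, lr=0.1):
    """Logistic regression trained with batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        grad = sigmoid(X @ w + b) - y
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


# Synthetic series (an assumption): cases track the previous day's PM2.5.
rng = np.random.default_rng(1)
n, window = 300, 3
pm25 = rng.normal(12.0, 3.0, size=n)
cases = np.empty(n)
cases[0] = 50.0
cases[1:] = 50.0 + 10.0 * (pm25[:-1] - 12.0) + rng.normal(0.0, 1.0, size=n - 1)

# Chronological split: the threshold and the scaling use the training part only.
split = int(0.7 * n)
threshold = np.median(cases[:split])
X, y = make_windows(pm25, cases, window, threshold)
n_train = split - window
X_train, y_train = X[:n_train], y[:n_train]
X_test, y_test = X[n_train:], y[n_train:]

mean, std = X_train.mean(axis=0), X_train.std(axis=0)
w, b = fit_logistic((X_train - mean) / std, y_train)

pred = (sigmoid(((X_test - mean) / std) @ w + b) > 0.5).astype(float)
accuracy = float(np.mean(pred == y_test))
print(f"held-out accuracy: {accuracy:.2f}")
```

Splitting by time rather than at random keeps later observations out of training, which matches the abstract's description of predicting the remaining portion of the series.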
topic COVID-19
New York city
time series analysis
daily infections
air pollution
machine learning
url https://www.mdpi.com/2076-3417/11/3/1177