Analyzing COVID‐19 Using Multisource Data: An Integrated Approach of Visualization, Spatial Regression, and Machine Learning

Abstract Coronavirus disease 2019 (COVID‐19), caused by severe acute respiratory syndrome coronavirus 2, was first identified in Wuhan, China, in December 2019. As the number of COVID‐19 infections and deaths worldwide continues to increase rapidly, the prevention and control of COVID‐19 remains urg...

Bibliographic Details
Main Authors: Chao Wu, Mengjie Zhou, Pengyu Liu, Mengjie Yang
Format: Article
Language: English
Published: American Geophysical Union (AGU) 2021-08-01
Series: GeoHealth
Subjects: COVID‐19; spatial‐temporal patterns; visualization; mixed GWR; XGBoost; geographical perspective
Online Access: https://doi.org/10.1029/2021GH000439
id doaj-6c693887535c4136afebdfa1eaa53472
record_format Article
spelling doaj-6c693887535c4136afebdfa1eaa53472; 2021-08-26T13:41:07Z; eng
American Geophysical Union (AGU); GeoHealth; 2471-1403; 2021-08-01; 5; 8; n/a; n/a; 10.1029/2021GH000439
Analyzing COVID‐19 Using Multisource Data: An Integrated Approach of Visualization, Spatial Regression, and Machine Learning
Chao Wu (0), Mengjie Zhou (1), Pengyu Liu (2), Mengjie Yang (3)
(0) School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, China
(1) College of Resources and Environmental Science, Hunan Normal University, Changsha, China
(2) School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, China
(3) College of Resources and Environmental Science, Hunan Normal University, Changsha, China
Abstract Coronavirus disease 2019 (COVID‐19), caused by severe acute respiratory syndrome coronavirus 2, was first identified in Wuhan, China, in December 2019. As the number of COVID‐19 infections and deaths worldwide continues to increase rapidly, the prevention and control of COVID‐19 remains urgent. This article aims to analyze COVID‐19 from a geographical perspective, and this information can provide useful insights for rapid visualization of spatial‐temporal epidemic information and identification of the factors important to the spread of COVID‐19. A new type of visualization method, called the point grid map, is integrated with calendar‐based visualization to show the spatial‐temporal variations in COVID‐19. The combination of mixed geographically weighted regression (mixed GWR) and extreme gradient boosting (XGBoost) is used to identify the potential factors and the corresponding importance. The visualization results clearly reflect the spatial‐temporal patterns of COVID‐19. The quantified results reveal that the impact of population outflow from Wuhan is the most important factor and indicate statistically significant spatial heterogeneity. Our results provide insights into how multisource big geodata can be employed within the framework of integrating visualization and analytical methods to characterize COVID‐19 trends. In addition, this work can help understand the influential factors for controlling and preventing epidemics, which is important for policy design and effective decision‐making for controlling COVID‐19. The results reveal that the most effective ways to control COVID‐19 include controlling the source of infection, cutting off the transmission route, and protecting vulnerable groups.
https://doi.org/10.1029/2021GH000439
COVID‐19; spatial‐temporal patterns; visualization; mixed GWR; XGBoost; geographical perspective
collection DOAJ
language English
format Article
sources DOAJ
author Chao Wu
Mengjie Zhou
Pengyu Liu
Mengjie Yang
author_sort Chao Wu
title Analyzing COVID‐19 Using Multisource Data: An Integrated Approach of Visualization, Spatial Regression, and Machine Learning
publisher American Geophysical Union (AGU)
series GeoHealth
issn 2471-1403
publishDate 2021-08-01
description Abstract Coronavirus disease 2019 (COVID‐19), caused by severe acute respiratory syndrome coronavirus 2, was first identified in Wuhan, China, in December 2019. As the number of COVID‐19 infections and deaths worldwide continues to increase rapidly, the prevention and control of COVID‐19 remains urgent. This article aims to analyze COVID‐19 from a geographical perspective, and this information can provide useful insights for rapid visualization of spatial‐temporal epidemic information and identification of the factors important to the spread of COVID‐19. A new type of visualization method, called the point grid map, is integrated with calendar‐based visualization to show the spatial‐temporal variations in COVID‐19. The combination of mixed geographically weighted regression (mixed GWR) and extreme gradient boosting (XGBoost) is used to identify the potential factors and the corresponding importance. The visualization results clearly reflect the spatial‐temporal patterns of COVID‐19. The quantified results reveal that the impact of population outflow from Wuhan is the most important factor and indicate statistically significant spatial heterogeneity. Our results provide insights into how multisource big geodata can be employed within the framework of integrating visualization and analytical methods to characterize COVID‐19 trends. In addition, this work can help understand the influential factors for controlling and preventing epidemics, which is important for policy design and effective decision‐making for controlling COVID‐19. The results reveal that the most effective ways to control COVID‐19 include controlling the source of infection, cutting off the transmission route, and protecting vulnerable groups.
topic COVID‐19
spatial‐temporal patterns
visualization
mixed GWR
XGBoost
geographical perspective
url https://doi.org/10.1029/2021GH000439
_version_ 1721193915722235904