Global Geographic and Temporal Analysis of SARS-CoV-2 Haplotypes Normalized by COVID-19 Cases During the Pandemic

Since the identification of SARS-CoV-2, a large number of genomes have been sequenced with unprecedented speed around the world. This marks a unique opportunity to analyze viral spread and evolution in a worldwide context. Currently, there is no useful haplotype description to help track im...


Bibliographic Details
Main Authors: Santiago Justo Arevalo, Daniela Zapata Sifuentes, César J. Huallpa, Gianfranco Landa Bianchi, Adriana Castillo Chávez, Romina Garavito-Salini Casas, Guillermo Uceda-Campos, Roberto Pineda Chavarria
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-02-01
Series: Frontiers in Microbiology
Subjects: SARS-CoV-2; COVID-19; viral pandemic; phylogenomic; global analysis; epidemiology
Online Access: https://www.frontiersin.org/articles/10.3389/fmicb.2021.612432/full
id doaj-2612ac8c8f594d53af7582cea3892ff2
record_format Article
spelling doaj-2612ac8c8f594d53af7582cea3892ff2
2021-03-04T15:42:30Z
eng
Frontiers Media S.A.
Frontiers in Microbiology
ISSN 1664-302X
2021-02-01
Volume 12
DOI 10.3389/fmicb.2021.612432
Article 612432
Santiago Justo Arevalo (Facultad de Ciencias Biológicas, Universidad Ricardo Palma, Lima, Peru; Department of Biochemistry, Institute of Chemistry, University of São Paulo, São Paulo, Brazil)
Daniela Zapata Sifuentes (Facultad de Ciencias Biológicas, Universidad Ricardo Palma, Lima, Peru)
César J. Huallpa (Facultad de Ciencias, Universidad Nacional Agraria La Molina, Lima, Peru)
Gianfranco Landa Bianchi (Facultad de Ciencias Biológicas, Universidad Ricardo Palma, Lima, Peru)
Adriana Castillo Chávez (Facultad de Ciencias Biológicas, Universidad Ricardo Palma, Lima, Peru)
Romina Garavito-Salini Casas (Facultad de Ciencias Biológicas, Universidad Ricardo Palma, Lima, Peru)
Guillermo Uceda-Campos (Facultad de Ciencias Biológicas, Universidad Nacional Pedro Ruiz Gallo, Lambayeque, Peru)
Roberto Pineda Chavarria (Facultad de Ciencias Biológicas, Universidad Ricardo Palma, Lima, Peru)
collection DOAJ
language English
format Article
sources DOAJ
author Santiago Justo Arevalo
Daniela Zapata Sifuentes
César J. Huallpa
Gianfranco Landa Bianchi
Adriana Castillo Chávez
Romina Garavito-Salini Casas
Guillermo Uceda-Campos
Roberto Pineda Chavarria
title Global Geographic and Temporal Analysis of SARS-CoV-2 Haplotypes Normalized by COVID-19 Cases During the Pandemic
publisher Frontiers Media S.A.
series Frontiers in Microbiology
issn 1664-302X
publishDate 2021-02-01
description Since the identification of SARS-CoV-2, a large number of genomes have been sequenced with unprecedented speed around the world. This marks a unique opportunity to analyze viral spread and evolution in a worldwide context. Currently, there is no useful haplotype description to help track important and globally scattered mutations. Also, differences in the number of sequenced genomes between countries and/or months make it difficult to identify the emergence of haplotypes in regions where few genomes are sequenced but a large number of cases are reported. We propose an approach that normalizes the relative frequencies of mutations by COVID-19 cases, using all the available data, to identify major haplotypes. Furthermore, we can use a similar normalization approach to track the temporal and geographic distribution of haplotypes in the world. Using 171,461 genomes, we identify five major haplotypes or operational taxonomic units (OTUs) based on nine high-frequency mutations. OTU_3, characterized by mutations R203K and G204R, is currently the most frequent haplotype circulating in four of the six continents analyzed (South America, North America, Europe, Asia, Africa, and Oceania). On the other hand, during almost all months analyzed, OTU_5, characterized by the mutation T85I in nsp2, is the most frequent in North America. Recently (since September), OTU_2 has been established as the most frequent in Europe. OTU_1, the ancestral haplotype, is near extinction, as shown by its low number of isolations since May. We also analyzed whether age, gender, or patient status is related to a specific OTU and found no preference of any OTU for any age group, gender, or patient status. Finally, we discuss structural and functional hypotheses for the most frequently identified mutations; none of those mutations shows a clear effect on transmissibility or pathogenicity.
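The case normalization mentioned in the description can be illustrated with a small sketch. This record does not give the authors' exact formula, so the weighting below is an assumption: each stratum (for example a country in a given month) contributes its within-stratum mutation frequency weighted by its reported COVID-19 cases. The function names, the dictionary keys, and the numbers are all hypothetical and serve only to show why a sparsely sequenced region with many cases can change the picture.

```python
def raw_frequency(strata):
    """Pooled frequency: genomes with the mutation / all genomes sequenced."""
    genomes = sum(s["genomes"] for s in strata)
    hits = sum(s["with_mutation"] for s in strata)
    return hits / genomes if genomes else 0.0


def case_weighted_frequency(strata):
    """Frequency weighted by reported cases (assumed normalization).

    Each stratum's relative frequency is multiplied by its case count,
    the products are summed, and the sum is divided by the total cases
    of the strata that have at least one sequenced genome.
    """
    weighted = 0.0
    total_cases = 0
    for s in strata:
        if s["genomes"] == 0:
            continue  # no sequences: this stratum cannot inform the estimate
        rel_freq = s["with_mutation"] / s["genomes"]
        weighted += rel_freq * s["cases"]
        total_cases += s["cases"]
    return weighted / total_cases if total_cases else 0.0


# Hypothetical data: region A sequences heavily, region B barely at all
# even though it reports nine times as many cases.
strata = [
    {"region": "A", "genomes": 1000, "with_mutation": 100, "cases": 10_000},
    {"region": "B", "genomes": 10, "with_mutation": 9, "cases": 90_000},
]

print(f"pooled:        {raw_frequency(strata):.3f}")
print(f"case-weighted: {case_weighted_frequency(strata):.3f}")
```

With these made-up numbers the pooled frequency is dominated by region A's many genomes, while the case-weighted estimate is dominated by region B's many cases. Note that region B's frequency rests on only ten genomes, so a real analysis would also need to account for the uncertainty of estimates from sparsely sequenced strata.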
topic SARS-CoV-2
COVID-19
viral pandemic
phylogenomic
global analysis
epidemiology
url https://www.frontiersin.org/articles/10.3389/fmicb.2021.612432/full