Dating the Common Ancestor from an NCBI Tree of 83688 High-Quality and Full-Length SARS-CoV-2 Genomes
All dating studies involving SARS-CoV-2 are problematic. Previous studies have dated the most recent common ancestor (MRCA) between SARS-CoV-2 and its close relatives from bats and pangolins. However, the evolutionary rate thus derived is expected to differ from the rate estimated from sequence dive...
Main Author: | |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2021-09-01
|
Series: | Viruses |
Subjects: | |
Online Access: | https://www.mdpi.com/1999-4915/13/9/1790 |
id |
doaj-4cde0dd3c9034ab78e3c362030ab685c |
---|---|
record_format |
Article |
spelling |
doaj-4cde0dd3c9034ab78e3c362030ab685c2021-09-26T01:37:24ZengMDPI AGViruses1999-49152021-09-01131790179010.3390/v13091790Dating the Common Ancestor from an NCBI Tree of 83688 High-Quality and Full-Length SARS-CoV-2 GenomesXuhua Xia0Department of Biology, University of Ottawa, Marie-Curie Private, Ottawa, ON K1N 9A7, CanadaAll dating studies involving SARS-CoV-2 are problematic. Previous studies have dated the most recent common ancestor (MRCA) between SARS-CoV-2 and its close relatives from bats and pangolins. However, the evolutionary rate thus derived is expected to differ from the rate estimated from sequence divergence of SARS-CoV-2 lineages. Here, I present dating results for the first time from a large phylogenetic tree with 86,582 high-quality full-length SARS-CoV-2 genomes. The tree contains 83,688 genomes with full specification of collection time. Such a large tree spanning a period of about 1.5 years offers an excellent opportunity for dating the MRCA of the sampled SARS-CoV-2 genomes. The MRCA is dated 16 August 2019, with the evolutionary rate estimated to be 0.05526 mutations/genome/day. The Pearson correlation coefficient (r) between the root-to-tip distance (D) and the collection time (T) is 0.86295. The NCBI tree also includes 10 SARS-CoV-2 genomes isolated from cats, collected over roughly the same time span as human COVID-19 infection. The MRCA from these cat-derived SARS-CoV-2 is dated 30 July 2019, with r = 0.98464. While the dating method is well known, I have included detailed illustrations so that anyone can repeat the analysis and obtain the same dating results. With 16 August 2019 as the date of the MRCA of sampled SARS-CoV-2 genomes, archived samples from respiratory or digestive tracts collected around or before 16 August 2019, or those that are not descendants of the existing SARS-CoV-2 lineages, should be particularly valuable for tracing the origin of SARS-CoV-2.https://www.mdpi.com/1999-4915/13/9/1790SARS-CoV-2tip rootingtip datingviral evolutionphylogenyCOVID-19 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Xuhua Xia |
spellingShingle |
Xuhua Xia Dating the Common Ancestor from an NCBI Tree of 83688 High-Quality and Full-Length SARS-CoV-2 Genomes Viruses SARS-CoV-2 tip rooting tip dating viral evolution phylogeny COVID-19 |
author_facet |
Xuhua Xia |
author_sort |
Xuhua Xia |
title |
Dating the Common Ancestor from an NCBI Tree of 83688 High-Quality and Full-Length SARS-CoV-2 Genomes |
title_short |
Dating the Common Ancestor from an NCBI Tree of 83688 High-Quality and Full-Length SARS-CoV-2 Genomes |
title_full |
Dating the Common Ancestor from an NCBI Tree of 83688 High-Quality and Full-Length SARS-CoV-2 Genomes |
title_fullStr |
Dating the Common Ancestor from an NCBI Tree of 83688 High-Quality and Full-Length SARS-CoV-2 Genomes |
title_full_unstemmed |
Dating the Common Ancestor from an NCBI Tree of 83688 High-Quality and Full-Length SARS-CoV-2 Genomes |
title_sort |
dating the common ancestor from an ncbi tree of 83688 high-quality and full-length sars-cov-2 genomes |
publisher |
MDPI AG |
series |
Viruses |
issn |
1999-4915 |
publishDate |
2021-09-01 |
description |
All dating studies involving SARS-CoV-2 are problematic. Previous studies have dated the most recent common ancestor (MRCA) between SARS-CoV-2 and its close relatives from bats and pangolins. However, the evolutionary rate thus derived is expected to differ from the rate estimated from sequence divergence of SARS-CoV-2 lineages. Here, I present dating results for the first time from a large phylogenetic tree with 86,582 high-quality full-length SARS-CoV-2 genomes. The tree contains 83,688 genomes with full specification of collection time. Such a large tree spanning a period of about 1.5 years offers an excellent opportunity for dating the MRCA of the sampled SARS-CoV-2 genomes. The MRCA is dated 16 August 2019, with the evolutionary rate estimated to be 0.05526 mutations/genome/day. The Pearson correlation coefficient (r) between the root-to-tip distance (D) and the collection time (T) is 0.86295. The NCBI tree also includes 10 SARS-CoV-2 genomes isolated from cats, collected over roughly the same time span as human COVID-19 infection. The MRCA from these cat-derived SARS-CoV-2 is dated 30 July 2019, with r = 0.98464. While the dating method is well known, I have included detailed illustrations so that anyone can repeat the analysis and obtain the same dating results. With 16 August 2019 as the date of the MRCA of sampled SARS-CoV-2 genomes, archived samples from respiratory or digestive tracts collected around or before 16 August 2019, or those that are not descendants of the existing SARS-CoV-2 lineages, should be particularly valuable for tracing the origin of SARS-CoV-2. |
topic |
SARS-CoV-2 tip rooting tip dating viral evolution phylogeny COVID-19 |
url |
https://www.mdpi.com/1999-4915/13/9/1790 |
work_keys_str_mv |
AT xuhuaxia datingthecommonancestorfromanncbitreeof83688highqualityandfulllengthsarscov2genomes |
_version_ |
1716868613481168896 |