Euclidean Distance Analysis Enables Nucleotide Skew Analysis in Viral Genomes

Bibliographic Details
Main Authors: Formijn van Hemert, Maarten Jebbink, Andries van der Ark, Frits Scholer, Ben Berkhout
Format: Article
Language: English
Published: Hindawi Limited 2018-01-01
Series: Computational and Mathematical Methods in Medicine
Online Access: http://dx.doi.org/10.1155/2018/6490647
id doaj-ac0a62334e4d46b3ae080aee42423511
record_format Article
spelling doaj-ac0a62334e4d46b3ae080aee42423511
2020-11-24T21:23:41Z
eng
Hindawi Limited
Computational and Mathematical Methods in Medicine
1748-670X
1748-6718
2018-01-01
2018
10.1155/2018/6490647
6490647
Euclidean Distance Analysis Enables Nucleotide Skew Analysis in Viral Genomes
Formijn van Hemert (0): Laboratory of Experimental Virology, Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, Netherlands
Maarten Jebbink (1): Laboratory of Experimental Virology, Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, Netherlands
Andries van der Ark (2): Research Institute of Child Development and Education, University of Amsterdam, Amsterdam, Netherlands
Frits Scholer (3): Medical Microbiology, Amsterdam UMC, University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, Netherlands
Ben Berkhout (4): Laboratory of Experimental Virology, Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, Netherlands
collection DOAJ
sources DOAJ
issn 1748-670X
1748-6718
description Nucleotide skew analysis is a versatile method for studying the nucleotide composition of RNA/DNA molecules, in particular for revealing characteristic sequence signatures. For instance, skew analysis of several viral RNA genomes indicated that the nucleotide bias is enriched in the unpaired, single-stranded genome regions, thus creating an even more striking virus-specific signature. However, comparing skew graphs for many virus isolates or families is difficult, time-consuming, and nonquantitative. Here, we present a procedure for simpler identification of similarities and dissimilarities between nucleotide skew data of coronavirus, flavivirus, picornavirus, and HIV-1 RNA genomes. Window and step sizes were normalized to correct for differences in viral genome length. Cumulative skew data are converted into pairwise Euclidean distance matrices, which can be presented as neighbor-joining trees. We present skew value trees for the four virus families and show that closely related viruses are placed in small clusters. Importantly, the skew value trees are similar to the trees constructed by a “classical” model of evolutionary nucleotide substitution. Thus, we conclude that the simple calculation of Euclidean distances between nucleotide skew data allows an easy and quantitative comparison of characteristic sequence signatures of virus genomes. These results indicate that Euclidean distance analysis of nucleotide skew data is a useful addition to the virology toolbox.
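The pipeline summarized in the description (length-normalized windows, cumulative skew, pairwise Euclidean distances) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The record does not give the skew formula, so the common definition (a - b) / (a + b) for a nucleotide pair is assumed here. Non-overlapping windows stand in for the paper's normalized window and step sizes. The function names and the `n_windows` parameter are hypothetical.

```python
import math


def window_skews(seq, a="G", b="C", n_windows=100):
    """Skew (a - b) / (a + b) in n_windows equal slices of seq.

    Assumption: using a fixed number of windows makes the window size
    scale with genome length, so profiles of different genomes have
    the same number of points and can be compared directly.
    """
    seq = seq.upper()
    size = len(seq) / n_windows
    skews = []
    for i in range(n_windows):
        chunk = seq[int(i * size):int((i + 1) * size)]
        na, nb = chunk.count(a), chunk.count(b)
        # A window containing neither nucleotide contributes zero skew.
        skews.append((na - nb) / (na + nb) if na + nb else 0.0)
    return skews


def cumulative(values):
    """Running sum of the per-window skew values."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


def euclidean(u, v):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def distance_matrix(genomes, n_windows=100):
    """Pairwise distances between cumulative skew profiles.

    genomes maps a (hypothetical) isolate name to its sequence string.
    """
    profiles = {
        name: cumulative(window_skews(seq, n_windows=n_windows))
        for name, seq in genomes.items()
    }
    return {
        (p, q): euclidean(profiles[p], profiles[q])
        for p in profiles
        for q in profiles
    }
```

The resulting matrix could then be passed to any neighbor-joining implementation to draw a skew value tree. That step is left out of the sketch.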