Euclidean Distance Analysis Enables Nucleotide Skew Analysis in Viral Genomes
Nucleotide skew analysis is a versatile method to study the nucleotide composition of RNA/DNA molecules, in particular to reveal characteristic sequence signatures. For instance, skew analysis of the nucleotide bias of several viral RNA genomes indicated that it is enriched in the unpaired, single-s...
Main Authors: | Formijn van Hemert, Maarten Jebbink, Andries van der Ark, Frits Scholer, Ben Berkhout |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2018-01-01 |
Series: | Computational and Mathematical Methods in Medicine |
Online Access: | http://dx.doi.org/10.1155/2018/6490647 |
id | doaj-ac0a62334e4d46b3ae080aee42423511 |
---|---|
record_format | Article |
spelling | doaj-ac0a62334e4d46b3ae080aee42423511; 2020-11-24T21:23:41Z; eng; Hindawi Limited; Computational and Mathematical Methods in Medicine; 1748-670X; 1748-6718; 2018-01-01; 2018; 10.1155/2018/6490647; 6490647; Euclidean Distance Analysis Enables Nucleotide Skew Analysis in Viral Genomes; Formijn van Hemert (0); Maarten Jebbink (1); Andries van der Ark (2); Frits Scholer (3); Ben Berkhout (4); Laboratory of Experimental Virology, Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, Netherlands; Laboratory of Experimental Virology, Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, Netherlands; Research Institute of Child Development and Education, University of Amsterdam, Amsterdam, Netherlands; Medical Microbiology, Amsterdam UMC, University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, Netherlands; Laboratory of Experimental Virology, Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, Netherlands; Nucleotide skew analysis is a versatile method to study the nucleotide composition of RNA/DNA molecules, in particular to reveal characteristic sequence signatures. For instance, skew analysis of the nucleotide bias of several viral RNA genomes indicated that it is enriched in the unpaired, single-stranded genome regions, thus creating an even more striking virus-specific signature. The comparison of skew graphs for many virus isolates or families is difficult, time-consuming, and nonquantitative. Here, we present a procedure for a more simple identification of similarities and dissimilarities between nucleotide skew data of coronavirus, flavivirus, picornavirus, and HIV-1 RNA genomes. Window and step sizes were normalized to correct for differences in length of the viral genome. Cumulative skew data are converted into pairwise Euclidean distance matrices, which can be presented as neighbor-joining trees. We present skew value trees for the four virus families and show that closely related viruses are placed in small clusters. Importantly, the skew value trees are similar to the trees constructed by a “classical” model of evolutionary nucleotide substitution. Thus, we conclude that the simple calculation of Euclidean distances between nucleotide skew data allows an easy and quantitative comparison of characteristic sequence signatures of virus genomes. These results indicate that the Euclidean distance analysis of nucleotide skew data forms a nice addition to the virology toolbox.; http://dx.doi.org/10.1155/2018/6490647 |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Formijn van Hemert, Maarten Jebbink, Andries van der Ark, Frits Scholer, Ben Berkhout |
title | Euclidean Distance Analysis Enables Nucleotide Skew Analysis in Viral Genomes |
publisher | Hindawi Limited |
series | Computational and Mathematical Methods in Medicine |
issn | 1748-670X, 1748-6718 |
publishDate | 2018-01-01 |
description | Nucleotide skew analysis is a versatile method to study the nucleotide composition of RNA/DNA molecules, in particular to reveal characteristic sequence signatures. For instance, skew analysis of the nucleotide bias of several viral RNA genomes indicated that it is enriched in the unpaired, single-stranded genome regions, thus creating an even more striking virus-specific signature. The comparison of skew graphs for many virus isolates or families is difficult, time-consuming, and nonquantitative. Here, we present a procedure for a more simple identification of similarities and dissimilarities between nucleotide skew data of coronavirus, flavivirus, picornavirus, and HIV-1 RNA genomes. Window and step sizes were normalized to correct for differences in length of the viral genome. Cumulative skew data are converted into pairwise Euclidean distance matrices, which can be presented as neighbor-joining trees. We present skew value trees for the four virus families and show that closely related viruses are placed in small clusters. Importantly, the skew value trees are similar to the trees constructed by a “classical” model of evolutionary nucleotide substitution. Thus, we conclude that the simple calculation of Euclidean distances between nucleotide skew data allows an easy and quantitative comparison of characteristic sequence signatures of virus genomes. These results indicate that the Euclidean distance analysis of nucleotide skew data forms a nice addition to the virology toolbox. |
url | http://dx.doi.org/10.1155/2018/6490647 |
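
The procedure summarized in the description (windowed skew values, window and step sizes normalized to genome length, cumulative skew, pairwise Euclidean distances) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the skew formula (G - C)/(G + C), the function names, the parameters `n_windows` and `window_frac`, and the toy sequences are all assumptions made for illustration.

```python
import math
from itertools import accumulate


def window_skews(seq, a="G", b="C", n_windows=100, window_frac=0.02):
    """Skew (a - b) / (a + b) in n_windows windows spread evenly over seq.

    Assumption: window size and step are fractions of the genome length,
    so genomes of different lengths yield profiles of equal length.
    """
    seq = seq.upper()
    size = max(1, round(len(seq) * window_frac))
    last_start = max(0, len(seq) - size)
    skews = []
    for i in range(n_windows):
        # Evenly spaced window starts from 0 to last_start.
        start = round(i * last_start / max(1, n_windows - 1))
        window = seq[start:start + size]
        na, nb = window.count(a), window.count(b)
        skews.append((na - nb) / (na + nb) if na + nb else 0.0)
    return skews


def cumulative_skew(seq, **kwargs):
    """Running sum of the windowed skew values."""
    return list(accumulate(window_skews(seq, **kwargs)))


def euclidean(p, q):
    """Euclidean distance between two equal-length skew profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def distance_matrix(profiles):
    """Pairwise distances as a nested dict keyed by sequence name."""
    names = list(profiles)
    return {m: {n: euclidean(profiles[m], profiles[n]) for n in names}
            for m in names}


# Hypothetical toy sequences, not real viral genomes.
toy = {
    "g_rich": "GGGA" * 50,
    "g_rich_long": "GGGA" * 80,
    "c_rich": "CCCA" * 50,
}
profiles = {name: cumulative_skew(s, n_windows=10, window_frac=0.1)
            for name, s in toy.items()}
dm = distance_matrix(profiles)
```

In this toy case the two G-rich sequences differ in length but produce equal-length profiles and end up close together, while the C-rich sequence lies far from both. The resulting matrix could then be handed to any neighbor-joining implementation to draw a tree; that step is omitted here.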