A rebuttal to the comments on the genome order index and the Z-curve

Abstract. Background: Elhaik, Graur and Josic recently commented on the genome order index (S) and the Z-curve (Elhaik et al. Biol Direct 2010, 5: 10). S is a quantity defined as S ...

Bibliographic Details
Main Author: Zhang Ren
Format: Article
Language: English
Published: BMC 2011-02-01
Series: Biology Direct
Online Access: http://www.biology-direct.com/content/6/1/10
id doaj-309601152c2d4b7a99ce186bf5d8d9f6
record_format Article
spelling doaj-309601152c2d4b7a99ce186bf5d8d9f6 2020-11-25T00:45:22Z eng BMC Biology Direct 1745-6150 2011-02-01 6 1 10 10.1186/1745-6150-6-10 A rebuttal to the comments on the genome order index and the Z-curve Zhang Ren http://www.biology-direct.com/content/6/1/10
collection DOAJ
language English
format Article
sources DOAJ
author Zhang Ren
spellingShingle Zhang Ren
A rebuttal to the comments on the genome order index and the Z-curve
Biology Direct
author_facet Zhang Ren
author_sort Zhang Ren
title A rebuttal to the comments on the genome order index and the Z-curve
title_short A rebuttal to the comments on the genome order index and the Z-curve
title_full A rebuttal to the comments on the genome order index and the Z-curve
title_fullStr A rebuttal to the comments on the genome order index and the Z-curve
title_full_unstemmed A rebuttal to the comments on the genome order index and the Z-curve
title_sort rebuttal to the comments on the genome order index and the z-curve
publisher BMC
series Biology Direct
issn 1745-6150
publishDate 2011-02-01
description Abstract
Background: Elhaik, Graur and Josic recently commented on the genome order index (S) and the Z-curve (Elhaik et al. Biol Direct 2010, 5: 10). S is a quantity defined as $S = a^2 + c^2 + g^2 + t^2$, where a, c, g and t denote the corresponding base frequencies. The Z-curve is a three-dimensional curve that represents a DNA sequence in such a way that each can be uniquely reconstructed from the other. Elhaik et al. made 4 major claims. 1) In the previous mapping system with the regular tetrahedron, the calculation of the radius of the inscribed sphere is "a mathematical error". 2) S follows an exponential distribution and is narrowly distributed, with a range of (0.25 - 0.33). 3) Based on Chargaff's second parity rule (PR2), "S is equivalent to H [Shannon entropy]" and they are derivable from each other. 4) The Z-curve "suffers from over dimensionality" because, based on an analysis of 235 bacterial genomes, the x and y components contributed less than 1% of the variance and therefore "would be of little use".
Results: 1) Elhaik et al. mistakenly neglected the parameter $4/\sqrt{3}$ when calculating the radius of the inscribed sphere. 2) The exponential distribution of S is a restatement of our previous conclusion, and the range of (0.25 - 0.33) only paraphrases the previously suggested S range (0.25 - 1/3). 3) Elhaik et al. incorrectly disregard deviations from PR2 by treating them as 0 altogether, thereby reducing S and H, each a function of 4 variables (a, c, g and t), to functions of the single variable a, and apply this treatment to all DNA sequences as the basis of their "demonstration", which is therefore invalid. 4) Elhaik et al. confuse numerical smallness with biological insignificance, and disregard the distributions of purine/pyrimidine and amino/keto bases (the x and y components). The variations of these components, although they can be smaller than those of GC content, contain rich information that is important and useful, for example in locating replication origins of bacterial and archaeal genomes and in studies of gene recognition in various species.
Conclusion: Elhaik et al. confuse S (a single number) with the Z-curve (a series of 3D coordinates), which are distinct. To use S as a case study of the Z-curve is, by itself, invalid. S and H are neither equivalent nor derivable from each other. The criticisms of Elhaik, Graur and Josic are wrong.
Reviewers: This article was reviewed by Erik van Nimwegen.
url http://www.biology-direct.com/content/6/1/10
work_keys_str_mv AT zhangren arebuttaltothecommentsonthegenomeorderindexandtheitzitcurve
AT zhangren rebuttaltothecommentsonthegenomeorderindexandtheitzitcurve
_version_ 1725270530847670272
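
The quantities named in the abstract can be illustrated with a short Python sketch. It is not taken from the article: the function names are hypothetical, the Z-curve step vectors follow the usual purine/pyrimidine, amino/keto and weak/strong hydrogen-bond convention (an assumption here, since the abstract names only the x and y components), and the two base compositions are made-up numbers chosen so that S is equal while H differs. It serves only as a small numerical check that, once PR2 is not imposed, a value of S does not determine H.

```python
import math
from collections import Counter

# Assumed Z-curve step per base: (purine/pyrimidine, amino/keto, weak/strong).
STEP = {"A": (1, 1, 1), "C": (-1, 1, -1), "G": (1, -1, -1), "T": (-1, -1, 1)}
BASE = {step: base for base, step in STEP.items()}


def base_frequencies(seq):
    """Return the frequencies (a, c, g, t), ignoring any other symbols."""
    counts = Counter(ch for ch in seq.upper() if ch in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return tuple(counts[b] / total for b in "ACGT")


def order_index(freqs):
    """Genome order index S = a^2 + c^2 + g^2 + t^2."""
    return sum(f * f for f in freqs)


def shannon_entropy(freqs):
    """Shannon entropy H in bits; zero frequencies contribute nothing."""
    return -sum(f * math.log2(f) for f in freqs if f > 0)


def z_curve(seq):
    """Cumulative (x, y, z) coordinates after each base of an ACGT string."""
    x = y = z = 0
    points = []
    for ch in seq.upper():
        dx, dy, dz = STEP[ch]
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def sequence_from_z_curve(points):
    """Recover the sequence from consecutive coordinate differences."""
    prev = (0, 0, 0)
    bases = []
    for point in points:
        step = tuple(c - p for c, p in zip(point, prev))
        bases.append(BASE[step])
        prev = point
    return "".join(bases)


if __name__ == "__main__":
    # Illustrative compositions (a, c, g, t); not data from the article.
    p = (0.5, 0.0, 0.0, 0.5)
    b = (3 - math.sqrt(3)) / 12  # solves (1 - 3b)^2 + 3b^2 = 0.5
    q = (1 - 3 * b, b, b, b)
    print("S:", order_index(p), order_index(q))
    print("H:", shannon_entropy(p), shannon_entropy(q))

    seq = "ACGTTGCA"
    print(sequence_from_z_curve(z_curve(seq)) == seq)
```

Both compositions give S = 0.5 (up to floating-point rounding), yet H is 1 bit for the first and roughly 1.4 bits for the second. The round trip through `z_curve` and `sequence_from_z_curve` mirrors the statement that the sequence and its Z-curve can each be reconstructed from the other.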