Summary: | Doctoral === National Central University === Institute of Physics === 97 === Segmental duplication is widely held to be a dominant feature in the dynamics of
genome growth and evolution. Yet how this would affect the global structure of
genomes has not been discussed. Here, we identify the equivalent length, Le, of a
genomic sequence as a medium through which that dominance may be discussed
quantitatively. By examining 865 complete chromosomes, we find the Le for a
genomic sequence to be nearly invariant and remarkably short compared with the true sequence
length (in terms of the statistics of two-letter words it is about 300 bases long) and
is approximately universal for all (examined) complete chromosomes. We verify this
result to be non-trivial, in particular, not caused by the similarity of sequences in
any commonly held sense, and demonstrate that it is easy to generate genome-like
sequences that do not have universal Le's. We establish a causal relation between short
Le and segmental duplication and show that a simple, random-segmental-duplication
driven model for genome growth generates highly diverse genome-like sequences that
have universal Le's. We postulate a connection between the universal value of Le and
maximum information capacity in genomic sequences and infer that the universality of
Le is a crucial product of the evolution of genomes toward maximum fitness.
|
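As an illustration only, the random-segmental-duplication growth idea mentioned in the abstract might be sketched as follows. This is a minimal sketch under stated assumptions, not the model used in the thesis: the function names, the seed length, the target length, the maximum segment length, and the uniform choice of segment and insertion point are all hypothetical. The sketch also tallies two-letter words, the statistic the abstract refers to, but it does not compute Le, since the abstract does not give its definition.

```python
import random


def grow_by_duplication(seed_len=300, target_len=20000, max_seg=500, rng=None):
    """Grow a sequence by copying random segments to random positions.

    All parameters are illustrative assumptions, not values from the thesis.
    """
    rng = rng or random.Random(0)
    # Start from a short random seed sequence over the four bases.
    seq = [rng.choice("ACGT") for _ in range(seed_len)]
    while len(seq) < target_len:
        # Pick a segment length that fits the sequence and the remaining room.
        seg_len = rng.randint(1, min(max_seg, len(seq), target_len - len(seq)))
        start = rng.randrange(len(seq) - seg_len + 1)
        segment = seq[start:start + seg_len]
        # Insert the copied segment at a uniformly chosen position.
        insert_at = rng.randrange(len(seq) + 1)
        seq[insert_at:insert_at] = segment
    return "".join(seq)


def two_letter_counts(seq):
    """Count overlapping two-letter words in the sequence."""
    counts = {}
    for i in range(len(seq) - 1):
        word = seq[i:i + 2]
        counts[word] = counts.get(word, 0) + 1
    return counts


if __name__ == "__main__":
    sequence = grow_by_duplication()
    counts = two_letter_counts(sequence)
    print(len(sequence), len(counts))
```

Because every base in the grown sequence descends from the short seed, the two-letter word counts of such a sequence can be compared with those of a purely random sequence of the same length to see the effect of duplication on the word statistics.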