The Maximal C3 Self-Complementary Trinucleotide Circular Code X in Genes of Bacteria, Archaea, Eukaryotes, Plasmids and Viruses

In 1996, a set X of 20 trinucleotides was identified in genes of both prokaryotes and eukaryotes which has on average the highest occurrence in reading frame compared to its two shifted frames. Furthermore, this set X has an interesting mathematical property as X is a maximal...

Bibliographic Details
Main Author: Christian J. Michel
Format: Article
Language: English
Published: MDPI AG 2017-04-01
Series: Life
Subjects: circular code in genes; DNA genes; RNA genes; double-stranded genes; single-stranded genes
Online Access: http://www.mdpi.com/2075-1729/7/2/20
id doaj-e2dedb4e90714841b5cdd356847f3c5e
record_format Article
spelling doaj-e2dedb4e90714841b5cdd356847f3c5e; 2020-11-24T22:57:11Z; eng; MDPI AG; Life; ISSN 2075-1729; 2017-04-01; vol. 7, no. 2, art. 20; doi:10.3390/life7020020; Christian J. Michel, Theoretical Bioinformatics, ICube, University of Strasbourg, CNRS, 300 Boulevard Sébastien Brant, 67400 Illkirch, France
collection DOAJ
language English
format Article
sources DOAJ
author Christian J. Michel
title The Maximal C3 Self-Complementary Trinucleotide Circular Code X in Genes of Bacteria, Archaea, Eukaryotes, Plasmids and Viruses
publisher MDPI AG
series Life
issn 2075-1729
publishDate 2017-04-01
description In 1996, a set X of 20 trinucleotides was identified in the genes of both prokaryotes and eukaryotes; on average, these trinucleotides occur more often in the reading frame than in either of the two shifted frames. Furthermore, this set X has an interesting mathematical property: X is a maximal C3 self-complementary trinucleotide circular code. In 2015, by quantifying the inspection approach used in 1996, the circular code X was confirmed in the genes of bacteria and eukaryotes and was also identified in the genes of plasmids and viruses. The method was based on the preferential occurrence of trinucleotides among the three frames at the gene population level. Here we extend this definition to the gene level. This new statistical approach gives all genes, long and short, the same weight when searching for the circular code X. As a consequence, the concept of circular code, in particular reading frame retrieval, is directly associated with each gene. At the gene level, the circular code X is strengthened in the genes of bacteria, eukaryotes, plasmids, and viruses, and is now also identified in the genes of archaea. The genes of mitochondria and chloroplasts contain a subset of the circular code X. Finally, by studying viral genes, the circular code X was found in DNA genomes, RNA genomes, double-stranded genomes, and single-stranded genomes.
topic circular code in genes
DNA genes
RNA genes
double-stranded genes
single-stranded genes
url http://www.mdpi.com/2075-1729/7/2/20
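
The description in this record contrasts counting trinucleotides at the gene population level with counting them gene by gene, where every gene has the same weight. The Python sketch below illustrates only that general idea and is not the statistic used in the article: the function names (`frame_counts`, `preferred_frame`, `gene_level_preference`), the frequency normalisation, and the tie handling are all assumptions made for illustration.

```python
from collections import Counter


def frame_counts(gene: str) -> list[Counter]:
    """Count trinucleotides in frame 0 (the reading frame) and in the
    two frames shifted by 1 and 2 nucleotides."""
    gene = gene.upper()
    return [
        Counter(gene[i:i + 3] for i in range(shift, len(gene) - 2, 3))
        for shift in range(3)
    ]


def preferred_frame(gene: str, trinucleotide: str):
    """Return the frame (0, 1 or 2) in which `trinucleotide` has the
    strictly highest relative frequency, or None on a tie.
    Assumption: ties are treated as 'no preference'."""
    freqs = []
    for counts in frame_counts(gene):
        total = sum(counts.values())
        freqs.append(counts[trinucleotide] / total if total else 0.0)
    best = max(freqs)
    winners = [frame for frame, value in enumerate(freqs) if value == best]
    return winners[0] if len(winners) == 1 else None


def gene_level_preference(genes: list[str], trinucleotide: str) -> list[float]:
    """Fraction of genes in which `trinucleotide` prefers each frame.
    Each gene contributes one vote, so long and short genes weigh the same."""
    if not genes:
        return [0.0, 0.0, 0.0]
    votes = Counter(preferred_frame(g, trinucleotide) for g in genes)
    return [votes[frame] / len(genes) for frame in range(3)]


if __name__ == "__main__":
    toy_genes = ["GAAGAAGAA", "AGAAGAAGA"]  # made-up sequences, not real genes
    print(gene_level_preference(toy_genes, "GAA"))
```

A population-level count would instead pool the trinucleotides of all genes before comparing frames, which lets long genes dominate; the per-gene vote above is one simple way to avoid that.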