The Average Mutual Information Profile as a Genomic Signature

Bibliographic Details
Main Authors: Schuster Sheldon M, Bauer Mark, Sayood Khalid
Format: Article
Language: English
Published: BMC 2008-01-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/9/48
id doaj-f2d8745ee85c457cb070ee342894d8f9
record_format Article
spelling doaj-f2d8745ee85c457cb070ee342894d8f9; 2020-11-24T21:40:02Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2008-01-01; 9; 1; 48; 10.1186/1471-2105-9-48; The Average Mutual Information Profile as a Genomic Signature; Schuster Sheldon M; Bauer Mark; Sayood Khalid; http://www.biomedcentral.com/1471-2105/9/48
collection DOAJ
language English
format Article
sources DOAJ
author Schuster Sheldon M
Bauer Mark
Sayood Khalid
title The Average Mutual Information Profile as a Genomic Signature
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2008-01-01
description Abstract. Background: Occult organizational structures in DNA sequences may hold the key to understanding functional and evolutionary aspects of the DNA molecule. Such structures can also provide the means for identifying and discriminating organisms using genomic data. Species-specific genomic signatures are useful in a variety of contexts, such as evolutionary analysis, assembly and classification of genomic sequences from large uncultivated microbial communities, and rapid identification systems in health hazard situations. Results: We have analyzed genomic sequences of eukaryotic and prokaryotic chromosomes, as well as various subtypes of viruses, using an information-theoretic framework. We confirm the existence of a species-specific average mutual information (AMI) profile. We use these profiles to define a very simple, computationally efficient, alignment-free distance measure that reflects the evolutionary relationships between genomic sequences. We use this distance measure to classify chromosomes according to species of origin, to separate and cluster subtypes of the HIV-1 virus, and to classify DNA fragments to species of origin. Conclusion: AMI profiles of DNA sequences prove to be species-specific and easy to compute. The structure of AMI profiles is conserved, even in short subsequences of a species' genome, rendering a pervasive signature. This signature can be used to classify relatively short DNA fragments to species of origin.
url http://www.biomedcentral.com/1471-2105/9/48
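
Illustrative sketch (not part of the catalog record): the abstract refers to an average mutual information (AMI) profile and a distance measure built on it, but this record gives neither formula. The Python below is a minimal sketch under stated assumptions. It uses the standard information-theoretic definition of mutual information between bases k positions apart, I(k) = sum over base pairs (x, y) of p_k(x, y) * log2(p_k(x, y) / (p(x) p(y))). The function names `ami_profile` and `profile_distance` are hypothetical, and the Euclidean distance is only a placeholder, since the article's actual distance measure is not described here.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def ami_profile(seq, max_lag):
    """Return [I(1), ..., I(max_lag)] in bits for a DNA string.

    I(k) is the mutual information between the base at position i and the
    base at position i + k, estimated from observed frequencies.
    """
    # Assumption: ambiguous symbols (e.g. N) are simply dropped. This shifts
    # positions across each removed symbol, which is acceptable for a sketch.
    bases = [b for b in seq.upper() if b in ALPHABET]
    n = len(bases)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")

    # Single-base frequencies p(x), estimated over the whole sequence.
    base_freq = {b: c / n for b, c in Counter(bases).items()}

    profile = []
    for k in range(1, max_lag + 1):
        # Joint counts of (base at i, base at i + k).
        pair_counts = Counter(zip(bases, bases[k:]))
        total = n - k
        info = 0.0
        for (x, y), count in pair_counts.items():
            p_xy = count / total
            info += p_xy * math.log2(p_xy / (base_freq[x] * base_freq[y]))
        profile.append(info)
    return profile


def profile_distance(p, q):
    """Euclidean distance between two equal-length profiles.

    Placeholder only: the article's own distance measure is not given in
    this record.
    """
    if len(p) != len(q):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


if __name__ == "__main__":
    # Toy sequences, not real genomic data.
    periodic = ami_profile("ACGT" * 50, 8)
    uniform = ami_profile("A" * 100, 8)
    print(periodic)
    print(profile_distance(periodic, uniform))
```

In the toy periodic sequence every base fully determines the base four positions later, so the lag-4 entry reaches the 2-bit maximum for a four-letter alphabet, while a single-base sequence carries no information at any lag.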