Genome-Wide Motif Statistics are Shaped by DNA Binding Proteins over Evolutionary Time Scales
The composition of a genome with respect to all possible short DNA motifs impacts the ability of DNA binding proteins to locate and bind their target sites. Since nonfunctional DNA binding can be detrimental to cellular functions and ultimately to organismal fitness, organisms could benefit from red...
Main Authors: | Long Qian, Edo Kussell |
---|---|
Format: | Article |
Language: | English |
Published: | American Physical Society 2016-10-01 |
Series: | Physical Review X |
Online Access: | http://doi.org/10.1103/PhysRevX.6.041009 |
id |
doaj-923353c3b1be47e0bded8f40d4d0d6d4 |
---|---|
record_format |
Article |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Long Qian, Edo Kussell |
title |
Genome-Wide Motif Statistics are Shaped by DNA Binding Proteins over Evolutionary Time Scales |
publisher |
American Physical Society |
series |
Physical Review X |
issn |
2160-3308 |
publishDate |
2016-10-01 |
description |
The composition of a genome with respect to all possible short DNA motifs impacts the ability of DNA binding proteins to locate and bind their target sites. Since nonfunctional DNA binding can be detrimental to cellular functions and ultimately to organismal fitness, organisms could benefit from reducing the number of nonfunctional DNA binding sites genome wide. Using in vitro measurements of binding affinities for a large collection of DNA binding proteins, in multiple species, we detect a significant global avoidance of weak binding sites in genomes. We demonstrate that the underlying evolutionary process leaves a distinct genomic hallmark in that similar words have correlated frequencies, a signal that we detect in all species across domains of life. We consider the possibility that natural selection against weak binding sites contributes to this process, and using an evolutionary model we show that the strength of selection needed to maintain global word compositions is on the order of point mutation rates. Likewise, we show that evolutionary mechanisms based on interference of protein-DNA binding with replication and mutational repair processes could yield similar results and operate with similar rates. On the basis of these modeling and bioinformatic results, we conclude that genome-wide word compositions have been molded by DNA binding proteins acting through tiny evolutionary steps over time scales spanning millions of generations. |
url |
http://doi.org/10.1103/PhysRevX.6.041009 |