A novel approach for transcription factor analysis using SELEX with high-throughput sequencing (TFAST).
In previous work, we designed a modified aptamer-free SELEX-seq protocol (afSELEX-seq) for the discovery of transcription factor binding sites. Here, we present original software, TFAST, designed to analyze afSELEX-seq data, validated against our previously generated afSELEX-seq dataset and a model...
Main Authors: | Daniel J Reiss, Frederick M Howard, Harry L T Mobley |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2012-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC3430675?pdf=render |
id |
doaj-5767dbe86f8e479d99daf961752b2a3b |
---|---|
record_format |
Article |
spelling |
doaj-5767dbe86f8e479d99daf961752b2a3b; 2020-11-25T01:52:50Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2012-01-01; 7; 8; e42761; 10.1371/journal.pone.0042761; A novel approach for transcription factor analysis using SELEX with high-throughput sequencing (TFAST).; Daniel J Reiss; Frederick M Howard; Harry L T Mobley; In previous work, we designed a modified aptamer-free SELEX-seq protocol (afSELEX-seq) for the discovery of transcription factor binding sites. Here, we present original software, TFAST, designed to analyze afSELEX-seq data, validated against our previously generated afSELEX-seq dataset and a model dataset. TFAST is designed with a simple graphical interface (Java) so that it can be installed and executed without extensive expertise in bioinformatics. TFAST completes analysis within minutes on most personal computers. Once afSELEX-seq data are aligned to a target genome, TFAST identifies peaks and, uniquely, compares peak characteristics between cycles. TFAST generates a hierarchical report of graded peaks, their associated genomic sequences, binding site length predictions, and dummy sequences. Including additional cycles of afSELEX-seq improved TFAST's ability to selectively identify peaks, leading to 7,274, 4,255, and 2,628 peaks identified in two-, three-, and four-cycle afSELEX-seq. Inter-round analysis by TFAST identified 457 peaks as the strongest candidates for true binding sites. Separating peaks by TFAST into classes of worst, second-best and best candidate peaks revealed a trend of increasing significance (e-values 4.5 × 10^12, 2.9 × 10^-46, and 1.2 × 10^-73) and informational content (11.0, 11.9, and 12.5 bits over 15 bp) of discovered motifs within each respective class. TFAST also predicted a binding site length (28 bp) consistent with non-computational experimentally derived results for the transcription factor PapX (22 to 29 bp). TFAST offers a novel and intuitive approach for determining DNA binding sites of proteins subjected to afSELEX-seq. Here, we demonstrate that TFAST, using afSELEX-seq data, rapidly and accurately predicted sequence length and motif for a putative transcription factor's binding site.; http://europepmc.org/articles/PMC3430675?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Daniel J Reiss; Frederick M Howard; Harry L T Mobley |
spellingShingle |
Daniel J Reiss; Frederick M Howard; Harry L T Mobley; A novel approach for transcription factor analysis using SELEX with high-throughput sequencing (TFAST).; PLoS ONE |
author_facet |
Daniel J Reiss; Frederick M Howard; Harry L T Mobley |
author_sort |
Daniel J Reiss |
title |
A novel approach for transcription factor analysis using SELEX with high-throughput sequencing (TFAST). |
title_sort |
novel approach for transcription factor analysis using selex with high-throughput sequencing (tfast). |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2012-01-01 |
description |
In previous work, we designed a modified aptamer-free SELEX-seq protocol (afSELEX-seq) for the discovery of transcription factor binding sites. Here, we present original software, TFAST, designed to analyze afSELEX-seq data, validated against our previously generated afSELEX-seq dataset and a model dataset. TFAST is designed with a simple graphical interface (Java) so that it can be installed and executed without extensive expertise in bioinformatics. TFAST completes analysis within minutes on most personal computers. Once afSELEX-seq data are aligned to a target genome, TFAST identifies peaks and, uniquely, compares peak characteristics between cycles. TFAST generates a hierarchical report of graded peaks, their associated genomic sequences, binding site length predictions, and dummy sequences. Including additional cycles of afSELEX-seq improved TFAST's ability to selectively identify peaks, leading to 7,274, 4,255, and 2,628 peaks identified in two-, three-, and four-cycle afSELEX-seq. Inter-round analysis by TFAST identified 457 peaks as the strongest candidates for true binding sites. Separating peaks by TFAST into classes of worst, second-best and best candidate peaks revealed a trend of increasing significance (e-values 4.5 × 10^12, 2.9 × 10^-46, and 1.2 × 10^-73) and informational content (11.0, 11.9, and 12.5 bits over 15 bp) of discovered motifs within each respective class. TFAST also predicted a binding site length (28 bp) consistent with non-computational experimentally derived results for the transcription factor PapX (22 to 29 bp). TFAST offers a novel and intuitive approach for determining DNA binding sites of proteins subjected to afSELEX-seq. Here, we demonstrate that TFAST, using afSELEX-seq data, rapidly and accurately predicted sequence length and motif for a putative transcription factor's binding site. |
url |
http://europepmc.org/articles/PMC3430675?pdf=render |