Genome-wide localization of protein-DNA binding and histone modification by a Bayesian change-point method with ChIP-seq data.

Next-generation sequencing (NGS) technologies have matured considerably since their introduction and a focus has been placed on developing sophisticated analytical tools to deal with the amassing volumes of data. Chromatin immunoprecipitation sequencing (ChIP-seq), a major application of NGS, is a w...


Bibliographic Details
Main Authors:	Haipeng Xing, Yifan Mo, Will Liao, Michael Q Zhang
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2012-01-01
Series:	PLoS Computational Biology
Online Access:	http://europepmc.org/articles/PMC3406014?pdf=render

id	doaj-e5db72d873ba4184bf2508afd79d8940
record_format	Article
spelling	doaj-e5db72d873ba4184bf2508afd79d8940 2020-11-25T02:31:46Z eng Public Library of Science (PLoS) PLoS Computational Biology 1553-734X 1553-7358 2012-01-01 8 7 e1002613 10.1371/journal.pcbi.1002613 Genome-wide localization of protein-DNA binding and histone modification by a Bayesian change-point method with ChIP-seq data. Haipeng Xing Yifan Mo Will Liao Michael Q Zhang http://europepmc.org/articles/PMC3406014?pdf=render
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Haipeng Xing Yifan Mo Will Liao Michael Q Zhang
title	Genome-wide localization of protein-DNA binding and histone modification by a Bayesian change-point method with ChIP-seq data.
publisher	Public Library of Science (PLoS)
series	PLoS Computational Biology
issn	1553-734X 1553-7358
publishDate	2012-01-01
description	Next-generation sequencing (NGS) technologies have matured considerably since their introduction and a focus has been placed on developing sophisticated analytical tools to deal with the amassing volumes of data. Chromatin immunoprecipitation sequencing (ChIP-seq), a major application of NGS, is a widely adopted technique for examining protein-DNA interactions and is commonly used to investigate epigenetic signatures of diffuse histone marks. These datasets have notoriously high variance and subtle levels of enrichment across large expanses, making them exceedingly difficult to define. Window-based heuristic models and finite-state hidden Markov models (HMMs) have been used with some success in analyzing ChIP-seq data but with lingering limitations. To improve the ability to detect broad regions of enrichment, we developed a stochastic Bayesian Change-Point (BCP) method, which addresses some of these unresolved issues. BCP makes use of recent advances in infinite-state HMMs by obtaining explicit formulas for posterior means of read densities. These posterior means can be used to categorize the genome into enriched and unenriched segments, as is customarily done, or examined for more detailed relationships since the underlying subpeaks are preserved rather than simplified into a binary classification. BCP performs a near-exhaustive search of all possible change points between different posterior means at high resolution to minimize the subjectivity of window sizes and is computationally efficient, due to a speed-up algorithm and the explicit formulas it employs. In the absence of a well-established "gold standard" for diffuse histone mark enrichment, we corroborated BCP's island detection accuracy and reproducibility using various forms of empirical evidence. We show that BCP is especially suited for analysis of diffuse histone ChIP-seq data but also effective in analyzing punctate transcription factor ChIP datasets, making it widely applicable for numerous experiment types.
url	http://europepmc.org/articles/PMC3406014?pdf=render
_version_	1724822175200837632

Similar Items