Quantized correlation coefficient for measuring reproducibility of ChIP-chip data

Abstract. Background: Chromatin immunoprecipitation followed by microarray hybridization (ChIP-chip) is used to study protein-DNA interactions and histone modifications on a genome scale. To ensure data quality, these experiments are usually performed in...


Bibliographic Details
Main Authors: Kuroda Mitzi I, Peng Shouyong, Park Peter J
Format: Article
Language: English
Published: BMC 2010-07-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/11/399
id doaj-84d2401996db485b9214c9aa0c3c7b2e
record_format Article
spelling doaj-84d2401996db485b9214c9aa0c3c7b2e; 2020-11-24T21:09:42Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2010-07-01; 11; 1; 399; 10.1186/1471-2105-11-399; Quantized correlation coefficient for measuring reproducibility of ChIP-chip data; Kuroda Mitzi I; Peng Shouyong; Park Peter J; http://www.biomedcentral.com/1471-2105/11/399
collection DOAJ
language English
format Article
sources DOAJ
author Kuroda Mitzi I
Peng Shouyong
Park Peter J
title Quantized correlation coefficient for measuring reproducibility of ChIP-chip data
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2010-07-01
description Abstract. Background: Chromatin immunoprecipitation followed by microarray hybridization (ChIP-chip) is used to study protein-DNA interactions and histone modifications on a genome scale. To ensure data quality, these experiments are usually performed in replicates, and a correlation coefficient between replicates is often used to assess reproducibility. However, the correlation coefficient can be misleading because it is affected not only by the reproducibility of the signal but also by the amount of binding signal present in the data. Results: We develop the quantized correlation coefficient (QCC), which is much less dependent on the amount of signal. It involves discretization of the data into a set of quantiles (quantization), a merging procedure to group the background probes, and recalculation of the Pearson correlation coefficient. This procedure reduces the influence of background noise on the statistic, which then focuses more on the reproducibility of the signal. The performance of this procedure is tested on both simulated and real ChIP-chip data. For replicates with different levels of enrichment over background and coverage, we find that QCC reflects reproducibility more accurately and is more robust than the standard Pearson or Spearman correlation coefficients. The quantization and merging procedure can also suggest a suitable quantile threshold for separating signal from background for further analysis. Conclusions: To measure reproducibility of ChIP-chip data correctly, a correlation coefficient that is robust to the amount of signal present should be used. QCC is one such measure. The QCC statistic can also be applied in a variety of other contexts for measuring reproducibility, including analysis of array CGH data for DNA copy number and gene expression data.
url http://www.biomedcentral.com/1471-2105/11/399
_version_ 1716757734613843968
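
Note on the description: the abstract outlines QCC as three steps: quantizing each replicate into quantile bins, merging the bins that hold background probes, and recomputing the Pearson correlation coefficient on the merged levels. The Python sketch below illustrates that outline only and is not the authors' implementation. The function names, the rank-based binning, the default of 20 bins, and the fixed `background_bins` cutoff are all assumptions made for illustration; this record does not say how the article chooses the merging threshold.

```python
import numpy as np


def quantize(values, n_bins):
    """Map each value to a quantile bin index in 0..n_bins-1, based on its rank."""
    values = np.asarray(values, dtype=float)
    # Double argsort turns values into ranks 0..n-1 (ties broken by position).
    ranks = np.argsort(np.argsort(values, kind="stable"), kind="stable")
    return (ranks * n_bins) // len(values)


def quantized_correlation(x, y, n_bins=20, background_bins=10):
    """Pearson correlation of two replicates after quantization and merging.

    ASSUMPTION: the lowest `background_bins` quantile bins are treated as
    background and collapsed into a single level. The fixed cutoff is a
    placeholder for the merging procedure mentioned in the abstract.
    """
    if not 0 <= background_bins < n_bins:
        raise ValueError("background_bins must satisfy 0 <= background_bins < n_bins")
    qx = quantize(x, n_bins)
    qy = quantize(y, n_bins)
    # Collapse bins 0..background_bins-1 into one level; higher bins are unchanged.
    qx = np.maximum(qx, background_bins - 1)
    qy = np.maximum(qy, background_bins - 1)
    return float(np.corrcoef(qx, qy)[0, 1])


if __name__ == "__main__":
    # Hypothetical simulated replicates: shared signal on 10% of probes plus noise.
    rng = np.random.default_rng(0)
    n_probes = 5000
    signal = np.zeros(n_probes)
    signal[: n_probes // 10] = 3.0
    rep1 = signal + rng.normal(size=n_probes)
    rep2 = signal + rng.normal(size=n_probes)
    print("Pearson:  ", float(np.corrcoef(rep1, rep2)[0, 1]))
    print("Quantized:", quantized_correlation(rep1, rep2))
```

In this sketch, setting `background_bins=0` disables merging, so the statistic reduces to the Pearson correlation of the quantile bin indices.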