Imputation for transcription factor binding predictions based on deep learning.
Understanding the cell-specific binding patterns of transcription factors (TFs) is fundamental to studying gene regulatory networks in biological systems, for which ChIP-seq not only provides valuable data but is also considered the gold standard. Despite tremendous efforts from the scientific co...
Main Authors: | Qian Qin, Jianxing Feng |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2017-02-01 |
Series: | PLoS Computational Biology |
Online Access: | http://europepmc.org/articles/PMC5345877?pdf=render |
id |
doaj-5af5c5f6661c4e37aedb59408a42fef7 |
---|---|
record_format |
Article |
spelling |
doaj-5af5c5f6661c4e37aedb59408a42fef7; 2020-11-25T01:12:25Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; 1553-734X; 1553-7358; 2017-02-01; 13; 2; e1005403; 10.1371/journal.pcbi.1005403; Imputation for transcription factor binding predictions based on deep learning.; Qian Qin; Jianxing Feng; http://europepmc.org/articles/PMC5345877?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Qian Qin, Jianxing Feng |
title |
Imputation for transcription factor binding predictions based on deep learning. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS Computational Biology |
issn |
1553-734X, 1553-7358 |
publishDate |
2017-02-01 |
description |
Understanding the cell-specific binding patterns of transcription factors (TFs) is fundamental to studying gene regulatory networks in biological systems, for which ChIP-seq not only provides valuable data but is also considered the gold standard. Despite tremendous efforts from the scientific community to conduct TF ChIP-seq experiments, the available data represent only a limited percentage of ChIP-seq experiments, considering all possible combinations of TFs and cell lines. In this study, we demonstrate a method for accurately predicting cell-specific TF binding for TF-cell line combinations based on only a small fraction (4%) of the combinations using available ChIP-seq data. The proposed model, termed TFImpute, is based on a deep neural network with a multi-task learning setting to borrow information across transcription factors and cell lines. Compared with existing methods, TFImpute achieves comparable accuracy on TF-cell line combinations with ChIP-seq data; moreover, TFImpute achieves better accuracy on TF-cell line combinations without ChIP-seq data. This approach can predict cell-line-specific enhancer activities in K562 and HepG2 cell lines, as measured by massively parallel reporter assays, and predicts the impact of SNPs on TF binding. |
url |
http://europepmc.org/articles/PMC5345877?pdf=render |