Imputation for transcription factor binding predictions based on deep learning.

Understanding the cell-specific binding patterns of transcription factors (TFs) is fundamental to studying gene regulatory networks in biological systems, for which ChIP-seq not only provides valuable data but is also considered the gold standard. Despite tremendous efforts from the scientific co...

Bibliographic Details
Main Authors: Qian Qin, Jianxing Feng
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2017-02-01
Series: PLoS Computational Biology
Online Access: http://europepmc.org/articles/PMC5345877?pdf=render
id doaj-5af5c5f6661c4e37aedb59408a42fef7
record_format Article
spelling doaj-5af5c5f6661c4e37aedb59408a42fef7; 2020-11-25T01:12:25Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; ISSN 1553-734X, 1553-7358; 2017-02-01; 13(2): e1005403; doi:10.1371/journal.pcbi.1005403; Imputation for transcription factor binding predictions based on deep learning.; Qian Qin; Jianxing Feng; http://europepmc.org/articles/PMC5345877?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Qian Qin
Jianxing Feng
title Imputation for transcription factor binding predictions based on deep learning.
publisher Public Library of Science (PLoS)
series PLoS Computational Biology
issn 1553-734X
1553-7358
publishDate 2017-02-01
description Understanding the cell-specific binding patterns of transcription factors (TFs) is fundamental to studying gene regulatory networks in biological systems, for which ChIP-seq not only provides valuable data but is also considered the gold standard. Despite tremendous efforts from the scientific community to conduct TF ChIP-seq experiments, the available data represent only a limited percentage of ChIP-seq experiments, considering all possible combinations of TFs and cell lines. In this study, we demonstrate a method for accurately predicting cell-specific TF binding for TF-cell line combinations based on only a small fraction (4%) of the combinations using available ChIP-seq data. The proposed model, termed TFImpute, is based on a deep neural network with a multi-task learning setting to borrow information across transcription factors and cell lines. Compared with existing methods, TFImpute achieves comparable accuracy on TF-cell line combinations with ChIP-seq data; moreover, TFImpute achieves better accuracy on TF-cell line combinations without ChIP-seq data. This approach can predict cell line-specific enhancer activities in K562 and HepG2 cell lines, as measured by massively parallel reporter assays, and can predict the impact of SNPs on TF binding.
url http://europepmc.org/articles/PMC5345877?pdf=render