A deep auto-encoder model for gene expression prediction

Abstract Background Gene expression is a key intermediate level through which genotypes lead to a particular trait. Gene expression is affected by various factors, including the genotypes of genetic variants. With the aim of delineating the genetic impact on gene expression, we build a deep auto-encoder model to a...


Bibliographic Details
Main Authors: Rui Xie, Jia Wen, Andrew Quitadamo, Jianlin Cheng, Xinghua Shi
Format: Article
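Language: English
Published: BMC 2017-11-01
Series: BMC Genomics
Subjects: Predictive model; Stacked denoising auto-encoder; Multilayer perceptron; Deep learning; Gene expression
Online Access: http://link.springer.com/article/10.1186/s12864-017-4226-0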
id doaj-c0e0727401f148a5b77c011e6bd2890a
record_format Article
spelling doaj-c0e0727401f148a5b77c011e6bd2890a 2020-11-24T20:47:59Z eng BMC BMC Genomics 1471-2164 2017-11-01 vol. 18, iss. S9, pp. 39-49 10.1186/s12864-017-4226-0
Rui Xie (0): Department of Computer Science, University of Missouri at Columbia
Jia Wen (1): Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte
Andrew Quitadamo (2): Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte
Jianlin Cheng (3): Department of Computer Science, University of Missouri at Columbia
Xinghua Shi (4): Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte
collection DOAJ
language English
format Article
sources DOAJ
author Rui Xie
Jia Wen
Andrew Quitadamo
Jianlin Cheng
Xinghua Shi
title A deep auto-encoder model for gene expression prediction
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2017-11-01
description Abstract. Background: Gene expression is a key intermediate level through which genotypes lead to a particular trait. Gene expression is affected by various factors, including the genotypes of genetic variants. With the aim of delineating the genetic impact on gene expression, we build a deep auto-encoder model to assess how well genetic variants contribute to gene expression changes. This new deep learning model is a regression-based predictive model built on the MultiLayer Perceptron and Stacked Denoising Auto-encoder (MLP-SAE). The model is trained using a stacked denoising auto-encoder for feature selection and a multilayer perceptron framework for backpropagation. We further improve the model by introducing dropout to prevent overfitting and improve performance. Results: To demonstrate the usage of this model, we apply MLP-SAE to a real genomic dataset with genotypes and gene expression profiles measured in yeast. Our results show that the MLP-SAE model with dropout outperforms other models, including Lasso, Random Forests and the MLP-SAE model without dropout. Using the MLP-SAE model with dropout, we show that gene expression quantifications predicted by the model solely from genotypes align well with true gene expression patterns. Conclusion: We provide a deep auto-encoder model for predicting gene expression from SNP genotypes. This study demonstrates that deep learning is appropriate for tackling another genomic problem, i.e., building predictive models to understand genotypes’ contribution to gene expression. With the emerging availability of richer genomic data, we anticipate that deep learning models will play a bigger role in modeling and interpreting genomics.
topic Predictive model
Stacked denoising auto-encoder
Multilayer perceptron
Deep learning
Gene expression
url http://link.springer.com/article/10.1186/s12864-017-4226-0
_version_ 1716809318257393664
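
Illustrative sketch (not part of the catalog record). The description above outlines a model that is pretrained as a stacked denoising auto-encoder and then fine-tuned as a multilayer perceptron with dropout to regress gene expression on genotypes. The NumPy code below is one minimal way such a pipeline could look. It is not the authors' implementation: the function names, layer sizes, corruption rate, dropout rate, learning rates, tied weights, sigmoid activations, squared-error loss and the toy data are all assumptions made for illustration, and genotypes are assumed to be coded in the range 0 to 1.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def pretrain_dae(X, n_hidden, corruption=0.2, lr=0.5, epochs=100):
    """Fit one tied-weight denoising auto-encoder layer; return encoder (W, b)."""
    n = len(X)
    W = rng.normal(0.0, 0.1, (X.shape[1], n_hidden))
    b, c = np.zeros(n_hidden), np.zeros(X.shape[1])
    for _ in range(epochs):
        X_noisy = X * (rng.random(X.shape) > corruption)  # masking noise
        H = sigmoid(X_noisy @ W + b)                      # encode corrupted input
        R = sigmoid(H @ W.T + c)                          # reconstruct clean input
        dR = (R - X) * R * (1.0 - R) / n                  # squared-error gradient
        dH = (dR @ W) * H * (1.0 - H)
        W -= lr * (X_noisy.T @ dH + dR.T @ H)             # encoder + decoder terms
        b -= lr * dH.sum(axis=0)
        c -= lr * dR.sum(axis=0)
    return W, b


def fit_mlp_sae(X, Y, hidden=(16, 8), dropout=0.5, lr=0.05, epochs=300):
    """Greedy layer-wise pretraining, then supervised fine-tuning with dropout."""
    layers, A = [], X
    for n_hidden in hidden:                       # stack the auto-encoders
        W, b = pretrain_dae(A, n_hidden)
        layers.append([W, b])
        A = sigmoid(A @ W + b)
    W_out = np.zeros((hidden[-1], Y.shape[1]))    # linear regression output layer
    b_out = np.zeros(Y.shape[1])
    for _ in range(epochs):
        inputs, hiddens, masks = [X], [], []
        for W, b in layers:                       # forward pass, inverted dropout
            H = sigmoid(inputs[-1] @ W + b)
            M = (rng.random(H.shape) > dropout) / (1.0 - dropout)
            hiddens.append(H)
            masks.append(M)
            inputs.append(H * M)
        delta = (inputs[-1] @ W_out + b_out - Y) / len(X)  # MSE gradient
        grad_in = delta @ W_out.T                 # computed before W_out changes
        W_out -= lr * (inputs[-1].T @ delta)
        b_out -= lr * delta.sum(axis=0)
        for i in reversed(range(len(layers))):    # backpropagate through the stack
            W, b = layers[i]
            d_pre = grad_in * masks[i] * hiddens[i] * (1.0 - hiddens[i])
            grad_in = d_pre @ W.T                 # computed before W changes
            W -= lr * (inputs[i].T @ d_pre)       # in-place, updates layers[i]
            b -= lr * d_pre.sum(axis=0)
    return layers, W_out, b_out


def predict(model, X):
    """Predict expression from genotypes; dropout is disabled at this stage."""
    layers, W_out, b_out = model
    A = X
    for W, b in layers:
        A = sigmoid(A @ W + b)
    return A @ W_out + b_out


# Toy data standing in for real genotypes and expression profiles.
X = rng.integers(0, 2, size=(60, 30)).astype(float)        # 60 samples, 30 SNPs
Y = X[:, :3] @ rng.normal(size=(3, 2)) + 0.1 * rng.normal(size=(60, 2))
model = fit_mlp_sae(X, Y)
pred = predict(model, X)                                    # shape (60, 2)
```

Setting `dropout=0.0` in `fit_mlp_sae` gives the no-dropout variant that the description compares against; a real evaluation would use held-out samples rather than predicting on the training data as this toy call does.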