Deconvolution of autoencoders to learn biological regulatory modules from single cell mRNA sequencing data

Abstract. Background: Unsupervised machine learning methods (deep learning) have shown their usefulness with noisy single cell mRNA-sequencing data (scRNA-seq), where the models generalize well, despite the zero-inflation of the data. A class of neural networks, namely autoencoders, has been useful fo...

Bibliographic Details
Main Authors:	Savvas Kinalis, Finn Cilius Nielsen, Ole Winther, Frederik Otzen Bagger
Format:	Article
Language:	English
Published:	BMC 2019-07-01
Series:	BMC Bioinformatics
Subjects:	Interpretable machine learning; Deep learning; Neural networks; Manifold learning; Expression profiles; Single-cell RNA-sequencing
Online Access:	http://link.springer.com/article/10.1186/s12859-019-2952-9

id	doaj-3e19d3420d44410db46d089f55470b1e
record_format	Article
spelling	doaj-3e19d3420d44410db46d089f55470b1e | 2020-11-25T03:23:38Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2019-07-01 | 20119 | 10.1186/s12859-019-2952-9 | Deconvolution of autoencoders to learn biological regulatory modules from single cell mRNA sequencing data | Savvas Kinalis (0), Finn Cilius Nielsen (1), Ole Winther (2), Frederik Otzen Bagger (3) | (0) Centre for Genomic Medicine Rigshospitalet, University of Copenhagen; (1) Centre for Genomic Medicine Rigshospitalet, University of Copenhagen; (2) Centre for Genomic Medicine Rigshospitalet, University of Copenhagen; (3) Centre for Genomic Medicine Rigshospitalet, University of Copenhagen | Abstract. Background: Unsupervised machine learning methods (deep learning) have shown their usefulness with noisy single cell mRNA-sequencing data (scRNA-seq), where the models generalize well, despite the zero-inflation of the data. A class of neural networks, namely autoencoders, has been useful for denoising of single cell data, imputation of missing values and dimensionality reduction. Results: Here, we present a striking feature with the potential to greatly increase the usability of autoencoders: With specialized training, the autoencoder is not only able to generalize over the data, but also to tease apart biologically meaningful modules, which we found encoded in the representation layer of the network. Our model can, from scRNA-seq data, delineate biologically meaningful modules that govern a dataset, as well as give information as to which modules are active in each single cell. Importantly, most of these modules can be explained by known biological functions, as provided by the Hallmark gene sets. Conclusions: We discover that tailored training of an autoencoder makes it possible to deconvolute biological modules inherent in the data, without any assumptions. By comparison with gene signatures of canonical pathways, we see that the modules are directly interpretable. The scope of this discovery has important implications, as it makes it possible to outline the drivers behind a given effect of a cell. In comparison with other dimensionality reduction methods, or supervised models for classification, our approach has the benefit of both handling the zero-inflated nature of scRNA-seq well and validating that the model captures relevant information, by establishing a link between input and decoded data. In perspective, our model in combination with clustering methods is able to provide information about which subtype a given single cell belongs to, as well as which biological functions determine that membership. | http://link.springer.com/article/10.1186/s12859-019-2952-9 | Interpretable machine learning; Deep learning; Neural networks; Manifold learning; Expression profiles; Single-cell RNA-sequencing
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Savvas Kinalis, Finn Cilius Nielsen, Ole Winther, Frederik Otzen Bagger
title	Deconvolution of autoencoders to learn biological regulatory modules from single cell mRNA sequencing data
publisher	BMC
series	BMC Bioinformatics
issn	1471-2105
publishDate	2019-07-01
description	Abstract. Background: Unsupervised machine learning methods (deep learning) have shown their usefulness with noisy single cell mRNA-sequencing data (scRNA-seq), where the models generalize well, despite the zero-inflation of the data. A class of neural networks, namely autoencoders, has been useful for denoising of single cell data, imputation of missing values and dimensionality reduction. Results: Here, we present a striking feature with the potential to greatly increase the usability of autoencoders: With specialized training, the autoencoder is not only able to generalize over the data, but also to tease apart biologically meaningful modules, which we found encoded in the representation layer of the network. Our model can, from scRNA-seq data, delineate biologically meaningful modules that govern a dataset, as well as give information as to which modules are active in each single cell. Importantly, most of these modules can be explained by known biological functions, as provided by the Hallmark gene sets. Conclusions: We discover that tailored training of an autoencoder makes it possible to deconvolute biological modules inherent in the data, without any assumptions. By comparison with gene signatures of canonical pathways, we see that the modules are directly interpretable. The scope of this discovery has important implications, as it makes it possible to outline the drivers behind a given effect of a cell. In comparison with other dimensionality reduction methods, or supervised models for classification, our approach has the benefit of both handling the zero-inflated nature of scRNA-seq well and validating that the model captures relevant information, by establishing a link between input and decoded data. In perspective, our model in combination with clustering methods is able to provide information about which subtype a given single cell belongs to, as well as which biological functions determine that membership.
topic	Interpretable machine learning; Deep learning; Neural networks; Manifold learning; Expression profiles; Single-cell RNA-sequencing
url	http://link.springer.com/article/10.1186/s12859-019-2952-9

Similar Items