Transfer learning compensates limited data, batch effects and technological heterogeneity in single-cell sequencing

Tremendous advances in next-generation sequencing technology have enabled the accumulation of large amounts of omics data in various research areas over the past decade. However, study limitations due to small sample sizes, especially in rare disease clinical research, technological heterogeneity an...

Bibliographic Details
Main Authors:	Hauschild, A.-C. (Author), Heider, D. (Author), Park, Y. (Author)
Format:	Article
Language:	English
Published:	Oxford University Press 2021
Subjects:	Article; autoencoder; automated pattern recognition; back propagation; big data; bioinformatics; classification algorithm; comparative study; data processing; deep neural network; dimensionality reduction; few-shot learning; gene expression; gene expression profiling; genomics; human; human tissue; k means clustering; learning algorithm; machine learning; meta-transfer learning; principal component analysis; sample size; single cell RNA seq; transcriptome; transcriptomics; transfer of learning
Online Access:	View Fulltext in Publisher (https://doi.org/10.1093/nargab/lqab104)


LEADER	02973nam a2200481Ia 4500
001	10.1093-nargab-lqab104
008	220427s2021 CNT 000 0 und d
020			\|a 26319268 (ISSN)
245	1	0	\|a Transfer learning compensates limited data, batch effects and technological heterogeneity in single-cell sequencing
260		0	\|b Oxford University Press \|c 2021
856			\|z View Fulltext in Publisher \|u https://doi.org/10.1093/nargab/lqab104
520	3		\|a Tremendous advances in next-generation sequencing technology have enabled the accumulation of large amounts of omics data in various research areas over the past decade. However, study limitations due to small sample sizes, especially in rare disease clinical research, technological heterogeneity and batch effects limit the applicability of traditional statistics and machine learning analysis. Here, we present a meta-transfer learning approach to transfer knowledge from big data and reduce the search space in data with small sample sizes. Few-shot learning algorithms integrate meta-learning to overcome data scarcity and data heterogeneity by transferring molecular pattern recognition models from datasets of unrelated domains. We explore few-shot learning models with large-scale public datasets, TCGA (The Cancer Genome Atlas) and GTEx, and demonstrate their potential as pre-training datasets in other molecular pattern recognition tasks. Our results show that meta-transfer learning is very effective for datasets with a limited sample size. Furthermore, we show that our approach can transfer knowledge across technological heterogeneity, for example, from bulk-cell to single-cell data. Our approach can overcome study size constraints, batch effects and technical limitations in analyzing single-cell data by leveraging existing bulk-cell sequencing data. © 2021 The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
650	0	4	\|a Article
650	0	4	\|a autoencoder
650	0	4	\|a automated pattern recognition
650	0	4	\|a back propagation
650	0	4	\|a big data
650	0	4	\|a bioinformatics
650	0	4	\|a classification algorithm
650	0	4	\|a comparative study
650	0	4	\|a data processing
650	0	4	\|a deep neural network
650	0	4	\|a dimensionality reduction
650	0	4	\|a few-shot learning
650	0	4	\|a gene expression
650	0	4	\|a gene expression profiling
650	0	4	\|a genomics
650	0	4	\|a human
650	0	4	\|a human tissue
650	0	4	\|a k means clustering
650	0	4	\|a learning algorithm
650	0	4	\|a machine learning
650	0	4	\|a meta-transfer learning
650	0	4	\|a principal component analysis
650	0	4	\|a sample size
650	0	4	\|a single cell RNA seq
650	0	4	\|a transcriptome
650	0	4	\|a transcriptomics
650	0	4	\|a transfer of learning
700	1		\|a Hauschild, A.-C. \|e author
700	1		\|a Heider, D. \|e author
700	1		\|a Park, Y. \|e author
773			\|t NAR Genomics and Bioinformatics
