A Novel Joint Gene Set Analysis Framework Improves Identification of Enriched Pathways in Cross Disease Transcriptomic Analysis

Motivation: Gene set enrichment analysis is a widely accepted expression analysis tool that aims to detect coordinated expression changes within pre-defined gene sets rather than in individual genes. The benefit of gene set analysis over individual differentially expressed (DE) gene analysis inclu...

Bibliographic Details
Main Authors: Wenyi Qin, Xujun Wang, Hongyu Zhao, Hui Lu
Format: Article
Language: English
Published: Frontiers Media S.A. 2019-04-01
Series: Frontiers in Genetics
Online Access: https://www.frontiersin.org/article/10.3389/fgene.2019.00293/full
id doaj-e32e828d9e4a4022ae4c01d6484a4321
record_format Article
spelling doaj-e32e828d9e4a4022ae4c01d6484a4321 2020-11-25T00:35:29Z eng Frontiers Media S.A. Frontiers in Genetics 1664-8021 2019-04-01 10 10.3389/fgene.2019.00293 444629
Wenyi Qin: Center for Biomedical Informatics, Shanghai Children's Hospital, Shanghai Jiaotong University, Shanghai, China; Department of Bioengineering, University of Illinois at Chicago, Chicago, IL, United States; Department of Genetics, School of Medicine, Yale University, New Haven, CT, United States
Xujun Wang: Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, Shanghai Jiaotong University, Shanghai, China
Hongyu Zhao: Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, Shanghai Jiaotong University, Shanghai, China; Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, United States
Hui Lu: Center for Biomedical Informatics, Shanghai Children's Hospital, Shanghai Jiaotong University, Shanghai, China; Department of Bioengineering, University of Illinois at Chicago, Chicago, IL, United States; Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, Shanghai Jiaotong University, Shanghai, China; Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, United States
collection DOAJ
sources DOAJ
author Wenyi Qin
Xujun Wang
Hongyu Zhao
Hui Lu
title A Novel Joint Gene Set Analysis Framework Improves Identification of Enriched Pathways in Cross Disease Transcriptomic Analysis
issn 1664-8021
description Motivation: Gene set enrichment analysis is a widely accepted expression analysis tool that aims to detect coordinated expression changes within pre-defined gene sets rather than in individual genes. Compared with individual differentially expressed (DE) gene analysis, gene set analysis yields more reproducible and interpretable results and can detect small but consistent changes across a gene set that DE gene analysis would miss. There have been many successful applications of gene set analysis to human diseases. However, when the sample size of a disease study is small and no other public data sets for the same disease are available, the analysis lacks the power to detect pathways that are important to the disease. Results: We have developed a novel joint gene set analysis statistical framework that aims to improve the power of identifying enriched gene sets by integrating multiple data sets from similar diseases. Through comprehensive simulation studies, we demonstrated that the proposed framework obtained much better AUC scores than single data set analysis and another meta-analysis method in identifying enriched pathways. When applied to two real data sets, the proposed framework retained the enriched gene sets identified by single data set analysis and exclusively identified up to 200% more disease-related gene sets, demonstrating the improved identification power gained from information shared between similar diseases. We expect that the proposed framework will enable researchers to make better use of public data sets when the sample size of their own study is limited.
topic public data integration
cross disease transcriptome
gene expression
gene set enrichment analysis
mixture model
EM algorithm