A Novel Joint Gene Set Analysis Framework Improves Identification of Enriched Pathways in Cross Disease Transcriptomic Analysis
Motivation: Gene set enrichment analysis is a widely accepted expression analysis tool that aims to detect coordinated expression changes within pre-defined gene sets rather than in individual genes. The benefit of gene set analysis over individual differentially expressed (DE) gene analysis inclu...
Main Authors: | Wenyi Qin, Xujun Wang, Hongyu Zhao, Hui Lu |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2019-04-01 |
Series: | Frontiers in Genetics |
Subjects: | |
Online Access: | https://www.frontiersin.org/article/10.3389/fgene.2019.00293/full |
id |
doaj-e32e828d9e4a4022ae4c01d6484a4321 |
---|---|
record_format |
Article |
spelling |
doaj-e32e828d9e4a4022ae4c01d6484a4321; 2020-11-25T00:35:29Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2019-04-01; 10; 10.3389/fgene.2019.00293; 444629
A Novel Joint Gene Set Analysis Framework Improves Identification of Enriched Pathways in Cross Disease Transcriptomic Analysis
Wenyi Qin: Center for Biomedical Informatics, Shanghai Children's Hospital, Shanghai Jiaotong University, Shanghai, China; Department of Bioengineering, University of Illinois at Chicago, Chicago, IL, United States; Department of Genetics, School of Medicine, Yale University, New Haven, CT, United States
Xujun Wang: Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, Shanghai Jiaotong University, Shanghai, China
Hongyu Zhao: Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, Shanghai Jiaotong University, Shanghai, China; Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, United States
Hui Lu: Center for Biomedical Informatics, Shanghai Children's Hospital, Shanghai Jiaotong University, Shanghai, China; Department of Bioengineering, University of Illinois at Chicago, Chicago, IL, United States; Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, Shanghai Jiaotong University, Shanghai, China; Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, United States |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Wenyi Qin, Xujun Wang, Hongyu Zhao, Hui Lu |
spellingShingle |
Wenyi Qin; Xujun Wang; Hongyu Zhao; Hui Lu; A Novel Joint Gene Set Analysis Framework Improves Identification of Enriched Pathways in Cross Disease Transcriptomic Analysis; Frontiers in Genetics; public data integration; cross disease transcriptome; gene expression; gene set enrichment analysis; mixture model; EM algorithm |
author_facet |
Wenyi Qin, Xujun Wang, Hongyu Zhao, Hui Lu |
author_sort |
Wenyi Qin |
title |
A Novel Joint Gene Set Analysis Framework Improves Identification of Enriched Pathways in Cross Disease Transcriptomic Analysis |
title_sort |
novel joint gene set analysis framework improves identification of enriched pathways in cross disease transcriptomic analysis |
publisher |
Frontiers Media S.A. |
series |
Frontiers in Genetics |
issn |
1664-8021 |
publishDate |
2019-04-01 |
description |
Motivation: Gene set enrichment analysis is a widely accepted expression analysis tool that aims to detect coordinated expression changes within pre-defined gene sets rather than in individual genes. Its benefits over individual differentially expressed (DE) gene analysis include more reproducible and interpretable results and the ability to detect small but consistent changes across a gene set that DE gene analysis would miss. There have been many successful applications of gene set analysis to human diseases. However, when the sample size of a disease study is small and no other public data sets of the same disease are available, the analysis lacks the power to detect pathways important to the disease. Results: We have developed a novel joint gene set analysis statistical framework that aims to improve the power to identify enriched gene sets by integrating multiple data sets from similar diseases. Through comprehensive simulation studies, we demonstrated that the proposed framework obtained much better AUC scores than single data set analysis and another meta-analysis method in identifying enriched pathways. When applied to two real data sets, the proposed framework retained the enriched gene sets identified by single data set analysis and exclusively obtained up to 200% more disease-related gene sets, demonstrating the improved identification power gained from information shared between similar diseases. We expect the proposed framework to enable researchers to make better use of public data sets when the sample size of their own study is limited. |
topic |
public data integration; cross disease transcriptome; gene expression; gene set enrichment analysis; mixture model; EM algorithm |
url |
https://www.frontiersin.org/article/10.3389/fgene.2019.00293/full |
_version_ |
1725308969535143936 |