Meta-analysis of crowdsourced data compendia suggests pan-disease transcriptional signatures of autoimmunity [version 1; referees: 2 approved]

Background: The proliferation of publicly accessible large-scale biological data together with increasing availability of bioinformatics tools has the potential to transform biomedical research. Here we report a crowdsourcing Jamboree that explored whether a team of volunteer biologists without for...

Bibliographic Details
Main Authors: William W. Lau, Rachel Sparks, OMiCC Jamboree Working Group, John S. Tsang
Format: Article
Language: English
Published: F1000 Research Ltd 2016-12-01
Series: F1000Research
Subjects: Control of Gene Expression; Genomics
Online Access: https://f1000research.com/articles/5-2884/v1
id doaj-0bdb4f2b2f9847f9a1d9e1404c17d94d
record_format Article
spelling doaj-0bdb4f2b2f9847f9a1d9e1404c17d94d
2020-11-25T02:54:40Z
eng
F1000 Research Ltd
F1000Research
2046-1402
2016-12-01
5
10.12688/f1000research.10465.1
11275
Meta-analysis of crowdsourced data compendia suggests pan-disease transcriptional signatures of autoimmunity [version 1; referees: 2 approved]
William W. Lau [0]
Rachel Sparks [1]
OMiCC Jamboree Working Group
John S. Tsang [2]
[0] Office of Intramural Research, Center for Information Technology, National Institutes of Health, Bethesda, Maryland, USA
[1] Systems Genomics and Bioinformatics Unit, Laboratory of Systems Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland, USA
[2] Systems Genomics and Bioinformatics Unit, Laboratory of Systems Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland, USA
Background: The proliferation of publicly accessible large-scale biological data together with increasing availability of bioinformatics tools has the potential to transform biomedical research. Here we report a crowdsourcing Jamboree that explored whether a team of volunteer biologists without formal bioinformatics training could use OMiCC, a crowdsourcing web platform that facilitates the reuse and (meta-) analysis of public gene expression data, to compile and annotate gene expression data, and design comparisons between disease and control sample groups. Methods: The Jamboree focused on several common human autoimmune diseases, including systemic lupus erythematosus (SLE), multiple sclerosis (MS), type I diabetes (DM1), and rheumatoid arthritis (RA), and the corresponding mouse models. Meta-analyses were performed in OMiCC using comparisons constructed by the participants to identify 1) gene expression signatures for each disease (disease versus healthy controls at the gene expression and biological pathway levels), 2) conserved signatures across all diseases within each species (pan-disease signatures), and 3) conserved signatures between species for each disease and across all diseases (cross-species signatures). Results: A large number of differentially expressed genes were identified for each disease based on meta-analysis, with observed overlap among diseases both within and across species. Gene set/pathway enrichment of upregulated genes suggested conserved signatures (e.g., interferon) across all human and mouse conditions. Conclusions: Our Jamboree exercise provides evidence that when enabled by appropriate tools, a "crowd" of biologists can work together to accelerate the pace at which the increasingly large amounts of public data can be reused and meta-analyzed for generating and testing hypotheses. Our encouraging experience suggests that a similar crowdsourcing approach can be used to explore other biological questions.
https://f1000research.com/articles/5-2884/v1
Control of Gene Expression
Genomics
collection DOAJ
language English
format Article
sources DOAJ
author William W. Lau
Rachel Sparks
OMiCC Jamboree Working Group
John S. Tsang
title Meta-analysis of crowdsourced data compendia suggests pan-disease transcriptional signatures of autoimmunity [version 1; referees: 2 approved]
publisher F1000 Research Ltd
series F1000Research
issn 2046-1402
publishDate 2016-12-01
description Background: The proliferation of publicly accessible large-scale biological data together with increasing availability of bioinformatics tools has the potential to transform biomedical research. Here we report a crowdsourcing Jamboree that explored whether a team of volunteer biologists without formal bioinformatics training could use OMiCC, a crowdsourcing web platform that facilitates the reuse and (meta-) analysis of public gene expression data, to compile and annotate gene expression data, and design comparisons between disease and control sample groups. Methods: The Jamboree focused on several common human autoimmune diseases, including systemic lupus erythematosus (SLE), multiple sclerosis (MS), type I diabetes (DM1), and rheumatoid arthritis (RA), and the corresponding mouse models. Meta-analyses were performed in OMiCC using comparisons constructed by the participants to identify 1) gene expression signatures for each disease (disease versus healthy controls at the gene expression and biological pathway levels), 2) conserved signatures across all diseases within each species (pan-disease signatures), and 3) conserved signatures between species for each disease and across all diseases (cross-species signatures). Results: A large number of differentially expressed genes were identified for each disease based on meta-analysis, with observed overlap among diseases both within and across species. Gene set/pathway enrichment of upregulated genes suggested conserved signatures (e.g., interferon) across all human and mouse conditions. Conclusions: Our Jamboree exercise provides evidence that when enabled by appropriate tools, a "crowd" of biologists can work together to accelerate the pace at which the increasingly large amounts of public data can be reused and meta-analyzed for generating and testing hypotheses. Our encouraging experience suggests that a similar crowdsourcing approach can be used to explore other biological questions.
topic Control of Gene Expression
Genomics
url https://f1000research.com/articles/5-2884/v1