First step toward gene expression data integration: transcriptomic data acquisition with COMMAND>_
Abstract. Background: Exploring cellular responses to stimuli using extensive gene expression profiles has become a routine procedure performed on a daily basis. Raw and processed data from these studies are available on public databases but the opportunity to fully exploit such rich datasets is limit...
Main Authors: | Marco Moretto, Paolo Sonego, Ana B. Villaseñor-Altamirano, Kristof Engelen |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2019-01-01 |
Series: | BMC Bioinformatics |
Subjects: | Transcriptomic; Gene expression; Microarray; RNA-seq; Compendia; Data integration |
Online Access: | http://link.springer.com/article/10.1186/s12859-019-2643-6 |
id |
doaj-06e353c3ecb04cfa82237af96a1d04c4 |
---|---|
record_format |
Article |
spelling |
doaj-06e353c3ecb04cfa82237af96a1d04c4; 2020-11-25T01:59:04Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2019-01-01; 20119; 10.1186/s12859-019-2643-6
First step toward gene expression data integration: transcriptomic data acquisition with COMMAND>_
Marco Moretto (Unit of Computational Biology, Research and Innovation Centre, Fondazione Edmund Mach); Paolo Sonego (Unit of Computational Biology, Research and Innovation Centre, Fondazione Edmund Mach); Ana B. Villaseñor-Altamirano (Laboratorio Internacional de Investigación Sobre el Genoma Humano, Universidad Nacional Autónoma De México); Kristof Engelen (Unit of Computational Biology, Research and Innovation Centre, Fondazione Edmund Mach)
Abstract. Background: Exploring cellular responses to stimuli using extensive gene expression profiles has become a routine procedure performed on a daily basis. Raw and processed data from these studies are available on public databases but the opportunity to fully exploit such rich datasets is limited due to the large heterogeneity of data formats. In recent years, several approaches have been proposed to effectively integrate gene expression data for analysis and exploration at a broader level. Despite the different goals and approaches towards gene expression data integration, the first step is common to any proposed method: data acquisition. Although it is seemingly straightforward to extract valuable information from a set of downloaded files, things can rapidly get complicated, especially as the number of experiments grows. Transcriptomic datasets are deposited in public databases with little regard to data format and thus retrieving raw data might become a challenging task. While for RNA-seq experiments such problem is partially mitigated by the fact that raw reads are generally available on databases such as the NCBI SRA, for microarray experiments standards are not equally well established, or enforced during submission, and thus a multitude of data formats has emerged.
Results: COMMAND>_ is a specialized tool meant to simplify gene expression data acquisition. It is a flexible multi-user web-application that allows users to search and download gene expression experiments, extract only the relevant information from experiment files, re-annotate microarray platforms, and present data in a simple and coherent data model for subsequent analysis.
Conclusions: COMMAND>_ facilitates the creation of local datasets of gene expression data coming from both microarray and RNA-seq experiments and may be a more efficient tool to build integrated gene expression compendia. COMMAND>_ is free and open-source software, including publicly available tutorials and documentation.
http://link.springer.com/article/10.1186/s12859-019-2643-6
Transcriptomic; Gene expression; Microarray; RNA-seq; Compendia; Data integration |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Marco Moretto Paolo Sonego Ana B. Villaseñor-Altamirano Kristof Engelen |
title |
First step toward gene expression data integration: transcriptomic data acquisition with COMMAND>_ |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2019-01-01 |
topic |
Transcriptomic; Gene expression; Microarray; RNA-seq; Compendia; Data integration |
url |
http://link.springer.com/article/10.1186/s12859-019-2643-6 |
_version_ |
1724966010293846016 |