A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1
As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequ...
Main Authors: | Steven Reisman, Thomas Hatzopoulos, Konstantin Läufer, George K. Thiruvathukal, Catherine Putonti |
---|---|
Format: | Article |
Language: | English |
Published: | SAGE Publishing 2016-01-01 |
Series: | Evolutionary Bioinformatics |
Online Access: | https://doi.org/10.4137/EBO.S32757 |
id |
doaj-437d20bdc62d42359fab43702c35c3c9 |
---|---|
record_format |
Article |
spelling |
doaj-437d20bdc62d42359fab43702c35c3c9; 2020-11-25T04:08:57Z; eng; SAGE Publishing; Evolutionary Bioinformatics; 1176-9343; 2016-01-01; 12; 10.4137/EBO.S32757; A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1; Steven Reisman (0), Thomas Hatzopoulos (1), Konstantin Läufer (2), George K. Thiruvathukal (3), Catherine Putonti (4); (0) Department of Biology, Loyola University Chicago, Chicago, IL, USA. (1) Department of Computer Science, Loyola University Chicago, Chicago, IL, USA. (2) Department of Computer Science, Loyola University Chicago, Chicago, IL, USA. (3) Department of Computer Science, Loyola University Chicago, Chicago, IL, USA. (4) Department of Biology, Loyola University Chicago, Chicago, IL, USA.; As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 sequences. Phylogenetic analyses were conducted for >6,000 HIV-1 sequences revealing spatial and temporal factors influence the evolution of the individual genes uniquely. Nevertheless, signatures of origin can be extrapolated even despite increased globalization. The approach developed here can easily be customized for any species of interest.; https://doi.org/10.4137/EBO.S32757 |
collection |
DOAJ |
sources |
DOAJ |
author |
Steven Reisman Thomas Hatzopoulos Konstantin Läufer George K. Thiruvathukal Catherine Putonti |
title |
A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1 |
publisher |
SAGE Publishing |
series |
Evolutionary Bioinformatics |
issn |
1176-9343 |
publishDate |
2016-01-01 |
description |
As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 sequences. Phylogenetic analyses were conducted for >6,000 HIV-1 sequences revealing spatial and temporal factors influence the evolution of the individual genes uniquely. Nevertheless, signatures of origin can be extrapolated even despite increased globalization. The approach developed here can easily be customized for any species of interest. |
url |
https://doi.org/10.4137/EBO.S32757 |