A toolkit for ARB to integrate custom databases and externally built phylogenies.

Researchers are perpetually amassing biological sequence data. The computational approaches employed by ecologists for organizing these data (e.g. alignment, phylogeny) typically scale nonlinearly in execution time with the size of the dataset. This often serves as a bottleneck for p...

Bibliographic Details
Main Authors:	Steven D Essinger, Erin Reichenberger, Calvin Morrison, Christopher B Blackwood, Gail L Rosen
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2015-01-01
Series:	PLoS ONE
Online Access:	http://europepmc.org/articles/PMC4301908?pdf=render

id	doaj-c18c0ed4ef6e4e2c8ef97267011bcf5a
record_format	Article
spelling	doaj-c18c0ed4ef6e4e2c8ef97267011bcf5a; 2020-11-25T01:33:17Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2015-01-01; vol. 10, no. 1, e0109277; doi:10.1371/journal.pone.0109277
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Steven D Essinger, Erin Reichenberger, Calvin Morrison, Christopher B Blackwood, Gail L Rosen
author_sort	Steven D Essinger
title	A toolkit for ARB to integrate custom databases and externally built phylogenies.
publisher	Public Library of Science (PLoS)
series	PLoS ONE
issn	1932-6203
publishDate	2015-01-01
description	Researchers are perpetually amassing biological sequence data. The computational approaches employed by ecologists for organizing these data (e.g. alignment, phylogeny) typically scale nonlinearly in execution time with the size of the dataset. This often serves as a bottleneck for processing experimental data, since many molecular studies are characterized by massive datasets. To keep up with experimental data demands, ecologists are forced to choose between continually upgrading expensive in-house computer hardware and outsourcing the most demanding computations to the cloud. Outsourcing is attractive because it is the less expensive option, but it does not necessarily allow direct user interaction with the data for exploratory analysis. Desktop analytical tools such as ARB are indispensable for this purpose, but they do not necessarily offer a convenient solution for coordinating and integrating datasets between local and outsourced destinations. Researchers are therefore left with an undesirable tradeoff between computational throughput and analytical capability. To mitigate this tradeoff, we introduce a software package that combines the interactive exploratory tools offered by ARB with the computational throughput of cloud-based resources. Our pipeline serves as middleware between the desktop and the cloud, allowing researchers to form local custom databases containing sequences and metadata from multiple resources and providing a method for linking data outsourced for computation back to the local database. A tutorial implementation of the toolkit is provided in the supporting information, S1 Tutorial. AVAILABILITY: http://www.ece.drexel.edu/gailr/EESI/tutorial.php
url	http://europepmc.org/articles/PMC4301908?pdf=render

Similar Items