High throughput profile-profile based fold recognition for the entire human proteome

Abstract. Background: In order to maintain the most comprehensive structural annotation databases we must carry out regular updates for each proteome using the latest profile-profile fold recognition methods. The ability to carry out these updates on dema...

Bibliographic Details
Main Authors:	Sørensen Søren-Aksel, Bryson Kevin, Smith Richard T, McGuffin Liam J, Jones David T
Format:	Article
Language:	English
Published:	BMC 2006-06-01
Series:	BMC Bioinformatics
Online Access:	http://www.biomedcentral.com/1471-2105/7/288

id	doaj-ee5447e357154e1e9253ce61014408ba
record_format	Article
spelling	doaj-ee5447e357154e1e9253ce61014408ba 2020-11-24T22:01:01Z eng BMC BMC Bioinformatics 1471-2105 2006-06-01 7 1 288 10.1186/1471-2105-7-288
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Sørensen Søren-Aksel Bryson Kevin Smith Richard T McGuffin Liam J Jones David T
author_sort	Sørensen Søren-Aksel
title	High throughput profile-profile based fold recognition for the entire human proteome
publisher	BMC
series	BMC Bioinformatics
issn	1471-2105
publishDate	2006-06-01
description	Abstract. Background: In order to maintain the most comprehensive structural annotation databases we must carry out regular updates for each proteome using the latest profile-profile fold recognition methods. The ability to carry out these updates on demand is necessary to keep pace with the regular updates of sequence and structure databases. Providing the highest quality structural models requires the most intensive profile-profile fold recognition methods running with the very latest available sequence databases and fold libraries. However, running these methods on such a regular basis for every sequenced proteome requires large amounts of processing power. In this paper we describe and benchmark the JYDE (Job Yield Distribution Environment) system, which is a meta-scheduler designed to work above cluster schedulers, such as Sun Grid Engine (SGE) or Condor. We demonstrate the ability of JYDE to distribute the load of genomic-scale fold recognition across multiple independent Grid domains. We use the most recent profile-profile version of our mGenTHREADER software in order to annotate the latest version of the Human proteome against the latest sequence and structure databases in as short a time as possible. Results: We show that our JYDE system is able to scale to large numbers of intensive fold recognition jobs running across several independent computer clusters. Using our JYDE system we have been able to annotate 99.9% of the protein sequences within the Human proteome in less than 24 hours, by harnessing over 500 CPUs from 3 independent Grid domains. Conclusion: This study clearly demonstrates the feasibility of carrying out on demand high quality structural annotations for the proteomes of major eukaryotic organisms. Specifically, we have shown that it is now possible to provide complete regular updates of profile-profile based fold recognition models for entire eukaryotic proteomes, through the use of Grid middleware such as JYDE.
url	http://www.biomedcentral.com/1471-2105/7/288
