A simple grid implementation with Berkeley Open Infrastructure for Network Computing using BLAST as a model


Bibliographic Details
Main Authors:	Watthanai Pinthong, Panya Muangruen, Prapat Suriyaphol, Dumrong Mairiang
Format:	Article
Language:	English
Published:	PeerJ Inc. 2016-07-01
Series:	PeerJ
Subjects:	Basic Local Alignment Search Tools (BLAST) Grid computing Data-intensive methods Berkeley Open Infrastructure for Network Computing (BOINC) Next-generation sequencing (NGS)
Online Access:	https://peerj.com/articles/2248.pdf

id	doaj-d1b4193df0e6439ca5f6db5ea578d397
record_format	Article
spelling	doaj-d1b4193df0e6439ca5f6db5ea578d397 | 2020-11-24T23:01:19Z | eng | PeerJ Inc. | PeerJ | 2167-8359 | 2016-07-01 | 4 | e2248 | 10.7717/peerj.2248 | A simple grid implementation with Berkeley Open Infrastructure for Network Computing using BLAST as a model
	Watthanai Pinthong (0), Panya Muangruen (1), Prapat Suriyaphol (2), Dumrong Mairiang (3)
	0: Department of Anatomy, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
	1: Siriraj Information Technology Department, Faculty of Medicine Siriraj Hospital, Mahidol University, , Thailand
	2: Division of Bioinformatics and Data Management for Research, Department of Research and Development, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
	3: Medical Biotechnology Research Laboratory, The National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, Pathumthani, Thailand
collection	DOAJ
language	English
format	Article
sources	DOAJ
publisher	PeerJ Inc.
series	PeerJ
issn	2167-8359
publishDate	2016-07-01
description	The development of high-throughput technologies, such as next-generation sequencing, allows thousands of experiments to be performed simultaneously while reducing resource requirements. Consequently, a massive amount of experimental data is now generated rapidly. Nevertheless, the data are not readily usable or meaningful until they are further analysed and interpreted. Because of the size of the data, a high performance computer (HPC) is required for the analysis and interpretation. However, an HPC is expensive and difficult to access. Alternatives such as cloud computing services and grid computing systems were developed to give researchers the power of an HPC without the need to purchase and maintain one. In this study, we implemented grid computing in a computer training center environment, using Berkeley Open Infrastructure for Network Computing (BOINC) as a job distributor and data manager that combines all desktop computers to virtualize an HPC. Fifty desktop computers were used to set up the grid system during off-hours. To test the performance of the grid system, we adapted the Basic Local Alignment Search Tool (BLAST) to the BOINC system. Sequencing results from the Illumina platform were aligned to the human genome database by BLAST on the grid system. The results and processing time were compared with those from a single desktop computer and an HPC. The estimated durations of BLAST analysis for 4 million sequence reads on a desktop PC, the HPC and the grid system were 568, 24 and 5 days, respectively. Thus, the grid implementation of BLAST by BOINC is an efficient alternative to the HPC for sequence alignment. The grid implementation by BOINC also helped tap unused computing resources during off-hours and could be easily modified for other available bioinformatics software.
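The three duration estimates in the description can be turned into speedup ratios with a few lines of arithmetic. The sketch below is only an illustration: the durations come from the abstract, while the dictionary keys and the `speedup` helper are hypothetical names introduced here, not part of the authors' software.

```python
# Estimated BLAST durations in days for 4 million sequence reads,
# as reported in the abstract. The keys are illustrative labels.
durations_days = {"desktop PC": 568, "HPC": 24, "grid": 5}


def speedup(baseline_days: float, candidate_days: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    return baseline_days / candidate_days


# Hypothetical helper usage: compare the grid against the other two systems.
grid_vs_desktop = speedup(durations_days["desktop PC"], durations_days["grid"])
grid_vs_hpc = speedup(durations_days["HPC"], durations_days["grid"])

print(f"grid vs desktop PC: {grid_vs_desktop:.1f}x")
print(f"grid vs HPC: {grid_vs_hpc:.1f}x")
```

These ratios are derived from estimated durations, so they should be read as rough indications rather than measured benchmarks.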


Similar Items