HCudaBLAST: an implementation of BLAST on Hadoop and Cuda

Abstract DNA sequencing has been a difficult field since its inception, and it is growing at an exponential rate. The amount of data involved in DNA searching is so large that conventional tools and algorithms cannot handle this degree of data processing....

Bibliographic Details
Main Authors:	Nilay Khare, Alind Khare, Farhan Khan
Format:	Article
Language:	English
Published:	SpringerOpen 2017-11-01
Series:	Journal of Big Data
Subjects:	DNA Searching BLAST CUDA Hadoop
Online Access:	http://link.springer.com/article/10.1186/s40537-017-0102-7

id	doaj-c5ebb937e92e48beaf051d8f3f9d9575
record_format	Article
spelling	doaj-c5ebb937e92e48beaf051d8f3f9d9575; 2020-11-25T00:39:34Z; eng; SpringerOpen; Journal of Big Data; 2196-1115; 2017-11-01; 4118; 10.1186/s40537-017-0102-7; HCudaBLAST: an implementation of BLAST on Hadoop and Cuda; Nilay Khare [0], Alind Khare [1], Farhan Khan [2]; [0] Maulana Azad National Institute of Technology, [1] IIIT, [2] Maulana Azad National Institute of Technology; http://link.springer.com/article/10.1186/s40537-017-0102-7; DNA Searching; BLAST; CUDA; Hadoop
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Nilay Khare Alind Khare Farhan Khan
title	HCudaBLAST: an implementation of BLAST on Hadoop and Cuda
publisher	SpringerOpen
series	Journal of Big Data
issn	2196-1115
publishDate	2017-11-01
description	Abstract DNA sequencing has been a difficult field since its inception, and it is growing at an exponential rate. The amount of data involved in DNA searching is so large that conventional tools and algorithms cannot handle this degree of data processing. BLAST is a tool provided by the National Center for Biotechnology Information (NCBI) that compares nucleotide or protein sequences against sequence databases and calculates the statistical significance of matches. BLAST variants such as blastn, blastp, and blastx are used to search nucleotide, protein, and nucleotide-to-protein sequences, respectively. GPU-BLAST and HBLAST have already been proposed to handle the vast amount of data involved in DNA sequence searching and to speed up the search. In this article, we propose a new model for searching DNA sequences: HCudaBLAST. It combines CUDA processing with Hadoop for efficient searching. The results recorded after implementing HCudaBLAST are shown. This solution combines the multi-core parallelism of GPGPUs with the scalability provided by the Hadoop framework.
topic	DNA Searching BLAST CUDA Hadoop
url	http://link.springer.com/article/10.1186/s40537-017-0102-7

Similar Items