RNA-Bloom : de novo RNA-seq assembly with Bloom filters

High-throughput RNA sequencing (RNA-seq) is primarily used in measuring gene expression, quantifying transcript abundance, and building reference transcriptomes. Without bias from a reference sequence, de novo RNA-seq assembly is particularly useful for building new reference transcriptomes, detecti...

Bibliographic Details
Main Author: Nip, Ka Ming
Language: English
Published: University of British Columbia 2017
Online Access: http://hdl.handle.net/2429/62590
id ndltd-UBC-oai-circle.library.ubc.ca-2429-62590
record_format oai_dc
spelling ndltd-UBC-oai-circle.library.ubc.ca-2429-62590 2018-01-05T17:29:54Z RNA-Bloom : de novo RNA-seq assembly with Bloom filters Nip, Ka Ming Science, Faculty of Alumni Graduate 2017-08-14T16:39:35Z 2017-08-14T16:39:35Z 2017 2017-11 Text Thesis/Dissertation http://hdl.handle.net/2429/62590 eng Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/ University of British Columbia
collection NDLTD
language English
sources NDLTD
description High-throughput RNA sequencing (RNA-seq) is primarily used for measuring gene expression, quantifying transcript abundance, and building reference transcriptomes. Because it is free of bias from a reference sequence, de novo RNA-seq assembly is particularly useful for building new reference transcriptomes, detecting fusion genes, and discovering novel spliced transcripts. This is a challenging problem, and at least eight approaches to it, including Trans-ABySS and Trinity, have been developed within the past decade. For instance, Trinity running on 12 CPUs takes approximately one and a half days to assemble a human RNA-seq sample of over 100 million read pairs and requires up to 80 GB of memory. While the high memory usage typical of de novo RNA-seq assemblers may be alleviated by distributed computing, access to a high-performance computing environment is a requirement that may be limiting for smaller labs. In my thesis, I present a novel de novo RNA-seq assembler, “RNA-Bloom,” which utilizes compact data structures based on Bloom filters for the storage of k-mer counts and the de Bruijn graph in memory. Compared to Trans-ABySS and Trinity, RNA-Bloom can assemble a human transcriptome with comparable accuracy using nearly half as much memory and half the wall-clock time with 12 threads.
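The idea of holding a de Bruijn graph in a Bloom filter can be illustrated with a minimal Python sketch. It is not RNA-Bloom's implementation: the class `KmerBloomFilter`, the helpers `kmers` and `successors`, the filter size, and the hash scheme are all assumptions made for illustration, and the sketch stores only k-mer membership, not the k-mer counts the abstract also mentions.

```python
import hashlib


class KmerBloomFilter:
    """Hypothetical Bloom filter over k-mers (illustrative only)."""

    def __init__(self, num_bits=1 << 20, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, kmer):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, kmer):
        # True for every inserted k-mer; may rarely be True for others.
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7)) for pos in self._positions(kmer)
        )


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def successors(bf, kmer):
    """Implicit de Bruijn edges: try all four one-base extensions."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in bf]


bf = KmerBloomFilter()
for km in kmers("ACGTACGGA", 4):
    bf.add(km)

print(successors(bf, "TACG"))
```

No edges are stored explicitly: graph neighbours are found by querying the filter for each possible extension, so memory is fixed by the filter size rather than by the number of edges. The cost is a small false-positive rate, which a real assembler has to tolerate or correct for.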
author Nip, Ka Ming
title RNA-Bloom : de novo RNA-seq assembly with Bloom filters
publisher University of British Columbia
publishDate 2017
url http://hdl.handle.net/2429/62590