RNA-Bloom : de novo RNA-seq assembly with Bloom filters
High-throughput RNA sequencing (RNA-seq) is primarily used in measuring gene expression, quantifying transcript abundance, and building reference transcriptomes. Without bias from a reference sequence, de novo RNA-seq assembly is particularly useful for building new reference transcriptomes, detecti...
Main Author: | Nip, Ka Ming |
---|---|
Language: | English |
Published: | University of British Columbia, 2017 |
Online Access: | http://hdl.handle.net/2429/62590 |
id | ndltd-UBC-oai-circle.library.ubc.ca-2429-62590 |
---|---|
record_format | oai_dc |
spelling | ndltd-UBC-oai-circle.library.ubc.ca-2429-62590; 2018-01-05T17:29:54Z; RNA-Bloom : de novo RNA-seq assembly with Bloom filters; Nip, Ka Ming; Science, Faculty of; Alumni; Graduate; 2017-08-14T16:39:35Z; 2017; 2017-11; Text; Thesis/Dissertation; http://hdl.handle.net/2429/62590; eng; Attribution-NonCommercial-NoDerivatives 4.0 International; http://creativecommons.org/licenses/by-nc-nd/4.0/; University of British Columbia |
collection | NDLTD |
language | English |
sources | NDLTD |
description | High-throughput RNA sequencing (RNA-seq) is primarily used for measuring gene expression, quantifying transcript abundance, and building reference transcriptomes. Because it is free of bias from a reference sequence, de novo RNA-seq assembly is particularly useful for building new reference transcriptomes, detecting fusion genes, and discovering novel spliced transcripts. This is a challenging problem, and at least eight approaches to it, including Trans-ABySS and Trinity, were developed within the past decade. For instance, using Trinity with 12 CPUs, assembling a human RNA-seq sample of over 100 million read pairs takes approximately one and a half days and requires up to 80 GB of memory. While the high memory usage typical of de novo RNA-seq assemblers may be alleviated by distributed computing, access to a high-performance computing environment is a requirement that may be limiting for smaller labs. In my thesis, I present a novel de novo RNA-seq assembler, “RNA-Bloom,” which uses compact data structures based on Bloom filters to store k-mer counts and the de Bruijn graph in memory. Compared to Trans-ABySS and Trinity, RNA-Bloom can assemble a human transcriptome with comparable accuracy using nearly half as much memory and half the wall-clock time with 12 threads. Science, Faculty of; Alumni; Graduate |
author | Nip, Ka Ming |
title | RNA-Bloom : de novo RNA-seq assembly with Bloom filters |
publisher | University of British Columbia |
publishDate | 2017 |
url | http://hdl.handle.net/2429/62590 |
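
The description in this record says that RNA-Bloom stores k-mers and the de Bruijn graph in Bloom filters. As a rough illustration of that general idea only, the sketch below builds a toy Bloom filter over the k-mers of a few reads and finds de Bruijn graph neighbors by querying the filter. It is not taken from the thesis: the names `KmerBloomFilter`, `kmers`, `build_filter`, and `successors`, the SHA-256 double-hashing scheme, and the default filter size are all assumptions made for this example, and it tracks only k-mer presence, not counts.

```python
import hashlib


class KmerBloomFilter:
    """Toy Bloom filter over k-mers (hypothetical; not RNA-Bloom's code)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, kmer: str):
        # Derive num_hashes bit positions from one SHA-256 digest
        # using double hashing: h1 + i * h2 (an assumed scheme).
        digest = hashlib.sha256(kmer.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, kmer: str) -> None:
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer: str) -> bool:
        # May return a false positive, but never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(kmer)
        )


def kmers(seq: str, k: int):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_filter(reads, k, num_bits=1 << 16, num_hashes=3):
    """Insert all k-mers of all reads into a new filter."""
    bf = KmerBloomFilter(num_bits, num_hashes)
    for read in reads:
        for kmer in kmers(read, k):
            bf.add(kmer)
    return bf


def successors(bf, kmer):
    """Implicit de Bruijn graph: try each 1-base extension against the filter."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in bf]


if __name__ == "__main__":
    bf = build_filter(["ACGTACGGA"], k=4)
    print(successors(bf, "ACGT"))
```

Because the graph is never stored explicitly, memory use depends on the filter size rather than on the number of edges; the trade-off is that a false positive in the filter can appear as a spurious neighbor.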