Genome Reassembler for Low-coverage Sequencing Regions


Bibliographic Details
Main Authors: Lin, Jyun-Hong, 林駿宏
Other Authors: Huang, Yao-Ting
Format: Others
Language: en_US
Published: 2017
Online Access: http://ndltd.ncl.edu.tw/handle/eg2d86
Description
Summary: Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 105 === Third-generation sequencing (TGS) produces much longer reads within a shorter turnaround time and is becoming the preferred choice for de novo genome assembly. Unfortunately, most large-genome sequencing projects cannot afford the sequencing depth required by existing TGS assemblers. Even when the desired sequencing depth is achieved, coverage across the genome remains uneven. This thesis designs and implements a genome re-assembler that aims to improve existing TGS assemblers by recovering missing overlaps and performing re-assembly in low-coverage regions. A novel dimensionality-reduction technique is developed for efficient overlap computation tailored to low-coverage regions. Experimental results indicate that our method improves the assemblies produced by Canu, Falcon, and Miniasm at low sequencing depth ($\le$ 30x). For large genomes, results at both low and high sequencing depth are improved, probably owing to larger coverage variance.
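
The distinction the summary draws between overall sequencing depth and uneven coverage can be illustrated with a minimal Python sketch. This is not the thesis's method: the function names, the half-open interval representation of aligned reads, and the threshold are all assumptions made for illustration. It shows only that a genome can meet an average depth target while still containing regions below it.

```python
from typing import List, Tuple


def per_base_depth(genome_len: int, reads: List[Tuple[int, int]]) -> List[int]:
    """Depth at each position, given reads as half-open intervals [start, end).

    Hypothetical helper: uses a difference array so the cost is
    O(genome_len + number of reads).
    """
    diff = [0] * (genome_len + 1)
    for start, end in reads:
        start = max(0, start)
        end = min(genome_len, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    depth = []
    running = 0
    for i in range(genome_len):
        running += diff[i]
        depth.append(running)
    return depth


def mean_depth(genome_len: int, reads: List[Tuple[int, int]]) -> float:
    """Average sequencing depth: total aligned bases divided by genome length."""
    return sum(per_base_depth(genome_len, reads)) / genome_len


def low_coverage_regions(depth: List[int], threshold: int) -> List[Tuple[int, int]]:
    """Maximal half-open runs of positions whose depth is below `threshold`."""
    regions = []
    start = None
    for pos, d in enumerate(depth):
        if d < threshold:
            if start is None:
                start = pos
        elif start is not None:
            regions.append((start, pos))
            start = None
    if start is not None:
        regions.append((start, len(depth)))
    return regions


if __name__ == "__main__":
    # Toy data (assumed): a 10 bp "genome" and three aligned reads.
    toy_reads = [(0, 5), (3, 8), (4, 6)]
    toy_depth = per_base_depth(10, toy_reads)
    print(mean_depth(10, toy_reads))
    print(low_coverage_regions(toy_depth, threshold=2))
```

In practice per-base depth would come from an alignment file rather than hand-written intervals; the toy input here only keeps the sketch self-contained.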