Exploring the Bias Influences of Next-Generation-Sequencing for de novo Genome Assembly

Master's === National Cheng Kung University === Department of Engineering Science (Master's and Doctoral Program) === 100 === Next-generation sequencing technology is now an important approach to decoding the genome. Dealing with millions of short reads has become a significant issue in the field of computing. In recent years, a series of tools has been developed to assembl...


Bibliographic Details
Main Authors: Yen-Chun Chen, 陳彥群
Other Authors: Chi-Chuan Hwang
Format: Others
Language: zh-TW
Published: 2012
Online Access: http://ndltd.ncl.edu.tw/handle/06246308292030359861
id ndltd-TW-100NCKU5028118
record_format oai_dc
spelling ndltd-TW-100NCKU5028118 2015-10-13T21:38:04Z http://ndltd.ncl.edu.tw/handle/06246308292030359861 Exploring the Bias Influences of Next-Generation-Sequencing for de novo Genome Assembly 探討次世代定序技術的定序偏差對全新基因體組裝的影響 Yen-Chun Chen 陳彥群 Master's, National Cheng Kung University, Department of Engineering Science (Master's and Doctoral Program), 100 Chi-Chuan Hwang 黃吉川 2012 thesis 134 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Cheng Kung University === Department of Engineering Science (Master's and Doctoral Program) === 100 === Next-generation sequencing technology is now an important approach to decoding the genome. Dealing with millions of short reads has become a significant issue in the field of computing. In recent years, a series of tools has been developed to assemble the huge number of fragments into more contiguous sequences. However, inherent sequencing bias may reduce assembly performance. The effects of bias on assembly have not been systematically discussed in the past. In this study, we simulate reads with specific degrees of sequencing bias and error-rate profiles for S. aureus, E. coli, M. tuberculosis, Arabidopsis thaliana Chr. 1 and Oryza sativa Chr. 5. We consider various bias scenarios for each assembler, including ALLPATHS-LG, ABySS, Edena, SOAPdenovo, SSAKE, Velvet and Velvet-SC, and employ an assembly evaluation tool, GAGE, to assess the assemblies by both N50 length and accuracy. Biased data sets lead to fragmentation and errors within assemblies. Regions with low read coverage either cannot be assembled or produce sequences containing SNPs, indels or reconstructions. Although most assemblers can deal with a small degree of bias in bacterial data, bias has a much deeper impact on the more complex plant genomes. A reasonable number of reads plays an important role in mitigating the bias. This study provides a novel view of assembly in terms of the relationship between coverage and sequencing bias.
author2 Chi-Chuan Hwang
author_facet Chi-Chuan Hwang
Yen-Chun Chen
陳彥群
author Yen-Chun Chen
陳彥群
publishDate 2012
url http://ndltd.ncl.edu.tw/handle/06246308292030359861
work_keys_str_mv AT yenchunchen exploringthebiasinfluencesofnextgenerationsequencingfordenovogenomeassembly
AT chényànqún exploringthebiasinfluencesofnextgenerationsequencingfordenovogenomeassembly
AT yenchunchen tàntǎocìshìdàidìngxùjìshùdedìngxùpiānchàduìquánxīnjīyīntǐzǔzhuāngdeyǐngxiǎng
AT chényànqún tàntǎocìshìdàidìngxùjìshùdedìngxùpiānchàduìquánxīnjīyīntǐzǔzhuāngdeyǐngxiǎng
_version_ 1718067325949706240