Summary: | Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 103 === In recent years, many genomes have been sequenced and assembled, so a newly sequenced genome is often closely related to an existing one. However, owing to complex repeat structures in the genome, genomes assembled by existing methods are often highly fragmented. In this thesis, we design a semi-assembly approach, called SemiAssembler, which integrates reference mapping and de novo assembly to reconstruct a newly sequenced genome using the sequence of a closely related genome. A draft genome is first created by adding inter-species insertions to, and removing inter-species deletions from, the related genome. Subsequently, the draft genome sequence is replaced with contig sequences assembled from short reads, so that it reflects inter-species SNPs and small indels. Simulation results indicate that our method has high precision and recall. The program was used to assemble two O. sativa genomes. A substantial number of the large insertions/deletions and small indels found by our method were validated by PCR.
|
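The two steps described in the summary (building a draft genome from large insertions and deletions, then overwriting parts of it with assembled contigs) can be illustrated with a minimal sketch. This is not SemiAssembler's implementation: the function names `build_draft` and `patch_with_contigs`, the variant tuple layout, and the assumption that variants and contig placements are already known and non-overlapping are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple, Union

# Hypothetical variant record: (kind, pos, payload)
#   ("ins", pos, seq)    -> insert the string seq before reference index pos
#   ("del", pos, length) -> delete `length` bases starting at reference index pos
Variant = Tuple[str, int, Union[str, int]]


def build_draft(reference: str, variants: List[Variant]) -> str:
    """Apply large insertions/deletions to a related genome to get a draft."""
    pieces = []
    cursor = 0  # next reference index that has not been copied yet
    for kind, pos, payload in sorted(variants, key=lambda v: v[1]):
        if pos < cursor:
            raise ValueError("variants overlap; this sketch assumes they do not")
        pieces.append(reference[cursor:pos])  # copy the unchanged stretch
        if kind == "ins":
            pieces.append(payload)            # add the inserted sequence
            cursor = pos
        elif kind == "del":
            cursor = pos + payload            # skip the deleted bases
        else:
            raise ValueError(f"unknown variant kind: {kind}")
    pieces.append(reference[cursor:])         # copy the tail of the reference
    return "".join(pieces)


def patch_with_contigs(draft: str, placements: List[Tuple[int, int, str]]) -> str:
    """Replace draft intervals [start, end) with contig sequences.

    Placements are assumed to be non-overlapping and given in draft
    coordinates; applying them right to left keeps those coordinates valid.
    """
    out = draft
    for start, end, contig in sorted(placements, key=lambda p: p[0], reverse=True):
        out = out[:start] + contig + out[end:]
    return out


if __name__ == "__main__":
    related = "ACGTACGT"
    draft = build_draft(related, [("ins", 2, "TTT"), ("del", 5, 2)])
    print(draft)  # ACTTTGTAT
    # One contig carries a SNP (C -> G), another a 1-bp deletion.
    print(patch_with_contigs(draft, [(0, 2, "AG"), (5, 7, "G")]))  # AGTTTGAT
```

In this toy setting the large variants change the overall structure first, and the contig replacement then corrects base-level differences, which mirrors the order of the two steps in the summary. A real pipeline would have to derive the variants and contig placements from read alignments, which this sketch leaves out.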