Exhaustive Discovery of Escherichia coli K-12 MG1655 Missing Genes Using Comparative Genomics Approaches

Master's thesis === National Yang-Ming University === Institute of Bioinformatics === 94 === The number of sequenced bacterial genomes has increased rapidly over the past decade, and many gene prediction programs such as Glimmer, GeneMark and ZCurve have been published. However, many genes may not be properly predicted by those programs. In thes...

Bibliographic Details
Main Authors: Shan-Chun Yang, 楊善淳
Other Authors: Chuan-Hsiung Chang
Format: Others
Language: zh-TW
Published: 2006
Online Access: http://ndltd.ncl.edu.tw/handle/29992441771085539490
id ndltd-TW-094YM005112007
record_format oai_dc
spelling ndltd-TW-094YM005112007 2015-10-13T16:31:17Z http://ndltd.ncl.edu.tw/handle/29992441771085539490 Exhaustive Discovery of Escherichia coli K-12 MG1655 Missing Genes Using Comparative Genomics Approaches 利用比較基因體學方法徹底搜尋大腸桿菌K-12MG1655之未知基因 Shan-Chun Yang 楊善淳 Master's thesis National Yang-Ming University Institute of Bioinformatics 94 Chuan-Hsiung Chang 張傳雄 2006 thesis 112 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === National Yang-Ming University === Institute of Bioinformatics === 94 === The number of sequenced bacterial genomes has increased rapidly over the past decade, and many gene prediction programs such as Glimmer, GeneMark and ZCurve have been published. However, many genes may not be properly predicted by those programs. In this thesis, we developed an ab initio method for bacterial gene prediction. We first obtained candidate genes that are highly conserved in bacterial genomes other than E. coli, because such highly conserved genes, identified through cross-species sequence comparison, are assumed to encode functional proteins. Next, we filtered the candidate gene pool by removing overlapping genes and genes that lack a proper translation start site or have abnormal coding potential. As a result, 4,436 genes were predicted in E. coli K-12 MG1655. The sensitivity and specificity are 90.56% and 87.98%, respectively. Our gene prediction method also reported 533 novel genes that were not found by other prediction methods. Eight of these novel genes are longer than 300 bp and have ribosomal binding sites in their upstream regions. They are conserved in two different species and do not overlap with rRNA genes, tRNA genes or pseudogenes. In future work, wet-lab experimental verification is required to confirm the authenticity of these novel genes. This method can be applied to currently available and upcoming bacterial genomes. The E. coli K-12 gene prediction results are available on the web and can be accessed from: http://140.129.78.155/services/BGF/.
author2 Chuan-Hsiung Chang
author_facet Chuan-Hsiung Chang
Shan-Chun Yang
楊善淳
author Shan-Chun Yang
楊善淳
spellingShingle Shan-Chun Yang
楊善淳
Exhaustive Discovery of Escherichia coli K-12 MG1655 Missing Genes Using Comparative Genomics Approaches
author_sort Shan-Chun Yang
title Exhaustive Discovery of Escherichia coli K-12 MG1655 Missing Genes Using Comparative Genomics Approaches
title_short Exhaustive Discovery of Escherichia coli K-12 MG1655 Missing Genes Using Comparative Genomics Approaches
title_full Exhaustive Discovery of Escherichia coli K-12 MG1655 Missing Genes Using Comparative Genomics Approaches
title_fullStr Exhaustive Discovery of Escherichia coli K-12 MG1655 Missing Genes Using Comparative Genomics Approaches
title_full_unstemmed Exhaustive Discovery of Escherichia coli K-12 MG1655 Missing Genes Using Comparative Genomics Approaches
title_sort exhaustive discovery of escherichia coli k-12 mg1655 missing genes using comparative genomics approaches
publishDate 2006
url http://ndltd.ncl.edu.tw/handle/29992441771085539490
work_keys_str_mv AT shanchunyang exhaustivediscoveryofescherichiacolik12mg1655missinggenesusingcomparativegenomicsapproaches
AT yángshànchún exhaustivediscoveryofescherichiacolik12mg1655missinggenesusingcomparativegenomicsapproaches
AT shanchunyang lìyòngbǐjiàojīyīntǐxuéfāngfǎchèdǐsōuxúndàchánggǎnjūnk12mg1655zhīwèizhījīyīn
AT yángshànchún lìyòngbǐjiàojīyīntǐxuéfāngfǎchèdǐsōuxúndàchánggǎnjūnk12mg1655zhīwèizhījīyīn
_version_ 1717771765077245952
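
Illustrative note on the filtering step described in the abstract. The abstract says the candidate pool was filtered by removing overlapping genes and genes without a proper translation start site or with abnormal coding potential, but it gives no thresholds or tie-breaking rules. The Python sketch below is therefore only an assumption-laden illustration, not the thesis's implementation: the `Candidate` record, the `coding_score` field, the `min_score` cutoff, the set of accepted start codons (ATG, GTG, TTG) and the rule of keeping the higher-scoring gene among overlapping ones are all hypothetical choices made for the example.

```python
from dataclasses import dataclass

# Assumed set of acceptable bacterial start codons (not taken from the thesis).
START_CODONS = {"ATG", "GTG", "TTG"}


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate-gene record used only for this sketch."""
    name: str
    start: int           # 0-based, inclusive genome coordinate
    end: int             # 0-based, exclusive genome coordinate
    seq: str             # coding-strand sequence, 5' to 3'
    coding_score: float  # hypothetical coding-potential score, higher is more gene-like


def has_proper_start(c: Candidate) -> bool:
    """True if the candidate begins with one of the assumed start codons."""
    return c.seq[:3].upper() in START_CODONS


def overlaps(a: Candidate, b: Candidate) -> bool:
    """True if the half-open intervals [start, end) of a and b intersect."""
    return a.start < b.end and b.start < a.end


def filter_candidates(cands, min_score=0.5):
    """Drop candidates with a bad start codon or low score, then resolve overlaps.

    Overlaps are resolved greedily: candidates are visited from highest to
    lowest score, and one is kept only if it overlaps nothing kept so far.
    """
    passing = [c for c in cands
               if has_proper_start(c) and c.coding_score >= min_score]
    kept = []
    for c in sorted(passing, key=lambda c: c.coding_score, reverse=True):
        if not any(overlaps(c, k) for k in kept):
            kept.append(c)
    return sorted(kept, key=lambda c: c.start)


# Toy input with made-up coordinates, sequences and scores.
example = [
    Candidate("a", 0, 300, "ATGAAATAA", 0.90),
    Candidate("b", 200, 500, "GTGAAATAA", 0.70),    # overlaps "a", lower score
    Candidate("c", 600, 900, "CTGAAATAA", 0.95),    # no accepted start codon
    Candidate("d", 1000, 1300, "TTGAAATAA", 0.20),  # score below the cutoff
    Candidate("e", 1500, 1800, "ATGAAATAA", 0.60),
]
print([c.name for c in filter_candidates(example)])
```

In this toy run only candidates "a" and "e" survive. A real pipeline would also need strand handling and a concrete coding-potential model, neither of which is specified in this record.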