Improved Optimization Algorithms for Operon Prediction in the Prokaryote Genome

Bibliographic Details
Main Authors: Jui-Hung Tsai, 蔡瑞鴻
Other Authors: Cheng-Hong Yang
Format: Others
Language: en_US
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/53049675591468883097
Description
Summary: Master's thesis === National Kaohsiung University of Applied Sciences === Department of Computer Science and Information Engineering === 98 === Operons of bacterial genomes contain information valuable for drug design and for determining protein functions. Co-transcribed genes likely have the same biological functions and directly affect each other; these genes are co-transcribed into a single-strand mRNA sequence. However, knowledge of operons is scarce, and the experimental methods for identifying operons are generally difficult to implement. To gain better insight, we improve a binary particle swarm optimization (BPSO) algorithm and a genetic algorithm (GA) and use them to predict operons in bacterial genomes. The intergenic distance and the gene strand condition are evaluated in the initiation step. Because the quality of the particles is boosted from initiation onward, the best particles can be obtained by successive progression through the generations. In this study, we calculated the logarithmic likelihood of the intergenic distance, metabolic pathway, cluster of orthologous groups (COG) gene function, and operon length properties in the Escherichia coli (NC_000913) genome as the fitness value of each gene. In addition, we used the gene length property of each benchmark genome to calculate a fitness value for each gene. Four bacterial genomes (Bacillus subtilis (NC_000964), Pseudomonas aeruginosa PAO1 (NC_002516), Staphylococcus aureus (NC_002952), and Mycobacterium tuberculosis (NC_000962)) were selected as benchmark genomes of known operon structure. Experimental results show that the proposed method not only increases the accuracy of operon prediction on the four genome data sets tested but also reduces the computation time for operon prediction.
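
The summary names two ingredients that a short sketch may help make concrete: an initial solution seeded from intergenic distance and strand, and a fitness built from log-likelihoods. The Python below is only an illustration under stated assumptions, not the thesis implementation. The 10 bp bin width, the 50 bp distance cutoff, the pseudocount, the one-bit-per-adjacent-gene-pair encoding, and all function names are hypothetical, and only the intergenic-distance property is scored.

```python
import math
from collections import Counter

BIN_WIDTH = 10  # bp per distance bin (hypothetical choice)


def bin_of(distance):
    """Map an intergenic distance in bp to a bin index."""
    return distance // BIN_WIDTH


def train_scores(operon_distances, boundary_distances, pseudocount=1.0):
    """Log-likelihood ratio per bin: within-operon pairs vs. boundary pairs.

    Both arguments are lists of intergenic distances taken from a genome
    with known operons (the summary uses E. coli for this role).
    """
    op = Counter(bin_of(d) for d in operon_distances)
    bd = Counter(bin_of(d) for d in boundary_distances)
    bins = set(op) | set(bd)
    n_op = len(operon_distances) + pseudocount * len(bins)
    n_bd = len(boundary_distances) + pseudocount * len(bins)
    return {
        b: math.log(((op[b] + pseudocount) / n_op) / ((bd[b] + pseudocount) / n_bd))
        for b in bins
    }


def initial_particle(genes, max_distance=50):
    """Seed one binary particle from strand and intergenic distance.

    genes: list of (start, end, strand) tuples sorted by start.
    Bit i is 1 if genes i and i+1 are placed in the same operon.
    """
    bits = []
    for (_, end1, strand1), (start2, _, strand2) in zip(genes, genes[1:]):
        distance = start2 - end1 - 1
        bits.append(1 if strand1 == strand2 and distance < max_distance else 0)
    return bits


def fitness(bits, genes, scores):
    """Sum the log-likelihood scores of all gene pairs joined by the particle."""
    total = 0.0
    for bit, (g1, g2) in zip(bits, zip(genes, genes[1:])):
        if bit:
            total += scores.get(bin_of(g2[0] - g1[1] - 1), 0.0)
    return total
```

In a full BPSO or GA run, particles seeded this way would then be updated over the generations and ranked by a fitness that also includes the pathway, COG, operon length, and gene length terms mentioned in the summary.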