The integrated bioinformatics study for computational systems biology

Bibliographic Details
Main Authors: Chun-Chi Liu, 劉俊吉
Other Authors: Wen-Shyen E. Chen
Format: Others
Language: en_US
Published: 2007
Online Access: http://ndltd.ncl.edu.tw/handle/98148240492297962568
Description
Summary: Doctoral === National Chung Hsing University === Department of Computer Science === 95 === Systems biology is the integrated study of biological systems at multiple levels, such as genomics, transcriptomics, proteomics, gene regulatory networks, and pathways; bioinformatics serves as the integrating tool for systematic data analysis and the construction of biological models. This study covers, among other topics, specific oligonucleotide (oligo) identification, microarray probe design, microarray data analysis, promoter analysis, genome context analysis, transcription factor (TF) regulation prediction, microRNA (miRNA) regulation prediction, pathway analysis, gene network construction, and prediction of complex regulation by miRNAs and TFs. The dissertation consists of three major components: (i) identification of specific oligos; (ii) development of a method for microarray data analysis; and (iii) integration of comprehensive biological databases. The first component is the genome-wide identification of specific oligos, which can be used in microarray probe design, primer design, and siRNA design. An artificial neural network (ANN) is a popular learning approach that handles noise and complex relationships robustly. In this dissertation, an ANN is used to integrate the densities of unique subsequences of length 10 to 26 (10-mers to 26-mers). We present a novel and efficient algorithm, named the IAB algorithm, that integrates the ANN with BLAST to identify specific oligos. In our experiments, the IAB algorithm was about 5-7 times faster than a BLAST search without the ANN. The second component is the development of a novel method for microarray data analysis, which can be used in cancer classification and pathway analysis. Cancer classification and pathway analysis are promising ways to uncover underlying molecular mechanisms from microarray data.
However, linking molecular classification and pathway analysis through a gene network approach has not yet been discussed. We found that gene networks themselves contain information useful for cancer classification and pathway analysis. In this dissertation, we developed a novel framework that constructs class-specific gene networks for classification and pathway analysis and that includes a novel network construction method, named ordering networks. On this basis, topology-based classification and pathway analysis were developed. We investigated the accuracy and stability of classification performance, the limitation of linear relationships, the power-law property, the time complexity of network construction, and literature studies. Our results suggest that the ordering network construction has outstanding performance. The third component is the integration of comprehensive biological databases and the study of complex regulation by TFs and miRNAs. TFs and miRNAs play important roles in regulating gene expression, and the study of their combinatorial regulation of gene expression is a new research field. In this dissertation, we constructed a comprehensive web server, named the composite regulatory signature database (CRSD), that integrates UniGene, miRBase, promoter, TRANSFAC, pathway, gene ontology, and genome databases. To allow microarray data to be analyzed in a single pass, several methods, including microarray data pretreatment, statistical and clustering analysis, iterative enrichment analysis, and motif discovery, are closely integrated in the web server. We further extended and enhanced the CRSD framework to develop the plant composite regulatory signature database (PlantCRSD), which includes comprehensive annotations of plant miRNAs for Arabidopsis thaliana and Oryza sativa.
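The enrichment analysis mentioned above is not specified further in this record. As a rough illustration only, the sketch below shows a one-sided hypergeometric test, a statistic commonly used to ask whether an annotation (for example, a pathway or a TF binding site) is over-represented in a selected gene list. The function name and parameters are hypothetical, and nothing here is taken from the CRSD implementation; the iterative scheme itself is not reproduced.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, selected, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    population: total number of genes considered
    annotated:  genes in the population carrying the annotation
    selected:   genes in the list under test (e.g. a cluster)
    overlap:    selected genes that carry the annotation

    Assumed statistic for illustration; not the CRSD method.
    """
    max_overlap = min(annotated, selected)
    total = comb(population, selected)
    # Sum the probability mass for every overlap at least as large
    # as the observed one.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 20000 genes, 150 annotated, 300 selected,
    # 12 of the selected genes annotated.
    print(hypergeom_enrichment_pvalue(20000, 150, 300, 12))
```

A small p-value would suggest that the annotation is over-represented in the selected list; in practice such values would also need correction for multiple testing across annotations.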
This study provides a framework of integrated and comprehensive knowledge for computational systems biology, spanning multiple levels and various fields of bioinformatics. The novel method of microarray probe design can help biological experiments obtain microarray data. The topology-based cancer classification and pathway analysis offer a new application of gene networks and may provide a criterion for evaluating the accuracy of gene networks. Our comprehensive web server is an integrative platform for systems biology studies. It provides predictions of genome-wide biological behaviors and a user-friendly interface, which may contribute to worldwide academic interconnection, knowledge base establishment, and mutual validation.
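The summary states that the ANN in the first component integrates densities of unique subsequences for 10-mers to 26-mers, but it does not define the density. The sketch below assumes one plausible definition: the fraction of k-mers in a target sequence that occur exactly once across a collection of sequences. The function names and this definition are assumptions made for illustration; the ANN and the BLAST step of the IAB algorithm are not shown.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every k-mer across all sequences in the collection."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def unique_kmer_density(target, sequences, k):
    """Fraction of k-mers in `target` that occur exactly once in
    `sequences` (the collection is expected to include `target`).

    Assumed definition of "density of unique subsequences".
    """
    n = len(target) - k + 1
    if n <= 0:
        return 0.0
    counts = kmer_counts(sequences, k)
    unique = sum(1 for i in range(n) if counts[target[i:i + k]] == 1)
    return unique / n


def density_features(target, sequences, k_min=10, k_max=26):
    """One density per k from k_min to k_max, e.g. as ANN inputs."""
    return [unique_kmer_density(target, sequences, k)
            for k in range(k_min, k_max + 1)]


if __name__ == "__main__":
    # Toy sequences, far shorter than real transcripts.
    seqs = ["ACGTAC", "GTACGG"]
    print(unique_kmer_density(seqs[0], seqs, 3))
```

Recounting k-mers for every call keeps the sketch short; a genome-scale version would build the counts once per k, or use a suffix-based index, and reuse them for all targets.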