USING TEXT MINING TECHNIQUES TO EXTRACT REGULATION BETWEEN TRANSCRIPTION FACTOR AND TARGET GENE

Bibliographic Details
Main Authors: Tock-kheng Kooi, 桂卓慶
Other Authors: Hei-Chia Wang
Format: Others
Language: zh-TW
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/05181713194049798191
Description
Summary: Master's === National Cheng Kung University === Institute of Information Management === 96 === The human genome sequence has been completely decoded, and the resulting data support gene identification and the study of gene regulation. Gene regulation research includes regulatory information between transcription factors (TFs) and target genes (TGenes), which helps biologists determine which TGenes a given TF regulates. At present, most of this information is recorded in the biological literature. Because that literature is growing rapidly, biologists can hardly afford the time to read all the related papers and extract TF-TGene regulation information by hand. Information technology that filters and extracts relationships between TFs and TGenes could therefore improve reading efficiency. Most current research focuses on protein-protein interactions, whereas this thesis focuses specifically on extracting regulation between TFs and TGenes. The difficulties are that (1) named entity recognition needs two domain dictionaries and (2) relations must be defined carefully: for example, a TF can regulate the expression of a TGene, but not the reverse. In addition, most researchers concentrate on extracting the important information and pay less attention to modality information; a phrase such as “Previous Studies…” indicates that the finding comes from earlier work and that the paper itself provides no experimental evidence for it. This thesis therefore uses text-mining techniques to analyze the literature retrieved from PubMed by TF queries and uses negative and positive patterns to predict the relationship between a TF and a target gene, which may give biologists valuable insight.
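
The approach outlined in the summary (dictionary-based entity recognition, a directional TF-to-TGene relation, positive and negative patterns, and modality cues) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the dictionaries, the pattern lists, the function names, and the rule that the TF must appear before the TGene in the sentence are all assumptions made for illustration, and the two dictionaries are assumed to be one for TFs and one for target genes.

```python
import re

# Hypothetical toy dictionaries (assumption); a real system would load curated lexicons.
TF_DICT = {"p53", "NF-kB"}
TGENE_DICT = {"p21", "IL6"}

# Illustrative pattern lists (assumption), not the thesis's actual patterns.
POSITIVE_PATTERNS = ("activat", "repress", "regulat", "induc")  # verb stems
NEGATIVE_PATTERNS = ("does not", "did not", "failed to", "no effect")
MODALITY_CUES = ("previous studies", "previously reported")


def find_terms(sentence, dictionary):
    """Return (term, start offset) for each dictionary term found as a whole token."""
    hits = []
    for term in dictionary:
        pattern = r"(?<![\w-])" + re.escape(term) + r"(?![\w-])"
        for match in re.finditer(pattern, sentence):
            hits.append((term, match.start()))
    return hits


def extract_regulations(sentence):
    """Extract TF -> TGene relations from one sentence using simple cue matching."""
    # Modality: a cue anywhere in the sentence marks the claim as reported, not shown.
    reported = any(cue in sentence.lower() for cue in MODALITY_CUES)
    results = []
    for tf, tf_pos in find_terms(sentence, TF_DICT):
        for gene, gene_pos in find_terms(sentence, TGENE_DICT):
            # Direction constraint (simplified): only the TF may act as regulator,
            # so the TF must come first. Passive voice is not handled here.
            if gene_pos <= tf_pos:
                continue
            between = sentence[tf_pos + len(tf):gene_pos].lower()
            if not any(p in between for p in POSITIVE_PATTERNS):
                continue  # no regulation verb links the two entities
            negated = any(p in between for p in NEGATIVE_PATTERNS)
            results.append({
                "tf": tf,
                "target": gene,
                "polarity": "negative" if negated else "positive",
                "evidence": "reported" if reported else "asserted",
            })
    return results


if __name__ == "__main__":
    print(extract_regulations("p53 activates p21 in response to DNA damage."))
    print(extract_regulations("Previous studies suggested that p53 does not activate p21."))
```

Substring cues are used only to keep the sketch short; a practical system would more likely work on parsed or part-of-speech-tagged sentences so that passive constructions and longer-range negation are handled.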