Summary: | Master's === Feng Chia University === Graduate Institute of Information Engineering === 94 === With the development of computer and information technology, much biological research can be facilitated by computer software, which speeds up both the research itself and the analysis of biological data. Moreover, the completion of the Human Genome Project (HGP) has promoted the development of related research. The resulting sequence data need to be analyzed and explored to discover the underlying mechanisms of life, so tools that assist this analysis are needed. To meet this requirement, we aim to provide a method that aligns ESTs to the genome. However, repetitive sequences make up one-tenth of the human genome, and most previous work handles these repetitive sequences poorly or cannot handle them at all. Our research therefore aims to handle both the repetitive and the unique sequences in the genome so that all ESTs can be aligned to their correct regions, and the results can then be used for further research and analysis. In addition, the number of human ESTs in dbEST has reached 7,678,812, and aligning all of them takes a great deal of time. We therefore provide different strategies, which save time while keeping the results within acceptable correctness, for aligning a single EST and for aligning all ESTs in dbEST to the genome. We consider the low-frequency and high-density index problem in order to locate an EST on the genome. We then propose a heuristic algorithm and use MUGUP to evaluate our approach with different test sets of ESTs.
|