Summary: | Doctoral === Chung Hua University === Ph.D. Program in Engineering Science === 98 === With the rapid development of science in many fields, a great deal of information needs to be processed and analyzed. Although computers are developing quickly, computing power cannot keep up with the growth of data. Connecting computing units is therefore a good strategy for obtaining higher performance at lower cost. This dissertation addresses three important problems in bioinformatics, cheminformatics, and data mining: ultrametric tree construction, chemical compound inference, and frequent pattern mining, respectively. An ultrametric tree helps biologists observe the relationships among species; it is also used in orthologous-domain classification and multiple sequence alignment. Chemical compound inference produces compounds that satisfy a given set of constraints; its applications include reconstructing molecular structures and classifying compounds, and it may be useful for drug design. Frequent pattern mining is a fundamental step in association rule mining, time series mining, classification, and other tasks. The computing power these problems demand increases rapidly as the amount of data grows. We therefore use parallel and distributed strategies to solve them. We also consider a critical issue in parallel computing, load balancing, and propose a corresponding technique for each topic. To verify the performance of the proposed algorithms, we implemented each of them. The experimental results show that the proposed algorithms reduce computation time on systems with multiple computing units and achieve good speed-up ratios.
|