Summary: | Doctoral === National Chung Hsing University === Department of Computer Science and Engineering === 102 === MicroRNAs (miRNAs) are a group of small noncoding RNA (ncRNA) molecules that play an important role in gene regulation. In this dissertation, the interactions between Ribonuclease III proteins (in particular, Drosha and Dicer) and primary and precursor miRNAs were analyzed. Statistics based on structural features were obtained and used to design the criteria for miRNA prediction. In addition, a genetic algorithm was devised to locate the positions of mature miRNAs. This research has been applied to some miRNA cluster sequences longer than 1 knt and correctly located the positions of all mature miRNAs.
For large-scale miRNA datasets, this research provides a mass-data microRNA prediction application based on a multi-layer MapReduce framework, with four prediction workflows for four different kinds of dataset: miRNA-like sequences, miRNA cluster sequences, unknown miRNA sequences, and next-generation sequencing (NGS) sequences. These workflows include four core procedures for finding the genome location, filtering by biological criteria, and classifying pre-miRNAs with a genetic algorithm. Each procedure is implemented as a MapReduce job and uses JSON to pass its output to the next MapReduce procedure. The results show that the miRNA prediction method not only has high sensitivity and accuracy but can also process more than one million sequences in an acceptable time by relying on a cloud computing system.
|