Summary: | Master's === National Chiao Tung University === Institute of Bioinformatics and Systems Biology === 102 === MicroRNAs (miRNAs) are small non-coding RNAs of ~22 nucleotides that play an important role in most organisms by regulating gene expression at the post-transcriptional level. By base-pairing with the 3’ untranslated region (3’-UTR) of an mRNA, miRNAs silence genes through mRNA degradation and translational repression. Given the importance of miRNAs, computational prediction of miRNA-mRNA pairs is a starting point for understanding the relationship between miRNAs and their targets.
However, existing methods focus heavily on seed-region complementarity or cross-species conservation. Although miRNA target prediction algorithms have progressed significantly, there is still room for improvement. In this study, we therefore aim to construct a model that predicts miRNA-mRNA interactions with high accuracy. We apply a GA-SVM algorithm, which combines a support vector machine (SVM) with a genetic algorithm (GA), to increase the accuracy of miRNA target classification by selecting an optimal feature subset. High-throughput datasets of miRNA-mRNA interactions are used for training and testing. Furthermore, data from miRTarBase were used for testing to improve the performance. The performance of the model is further evaluated on an independent set and compared with other algorithms. In addition, the datasets are analyzed further to characterize miRNA-mRNA interactions.
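The feature-subset search that the GA contributes to a GA-SVM pipeline can be sketched roughly as below. This is a minimal illustration, not the thesis's implementation: the function name `ga_select`, the population size, the mutation rate, and the toy fitness function are all assumptions. In a GA-SVM setting, the fitness of a feature mask would instead be something like the cross-validated accuracy of an SVM trained on the selected features (for example scikit-learn's `SVC` scored with `cross_val_score`).

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Search for a boolean feature mask that maximizes fitness(mask).

    Hypothetical helper: all default parameter values are illustrative.
    """
    if n_features < 2 or pop_size < 4:
        raise ValueError("need at least 2 features and a population of 4")
    rng = random.Random(seed)
    # Start from random masks: True means the feature is kept.
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]      # truncation selection
        next_gen = [ranked[0][:]]             # elitism: keep the current best
        while len(next_gen) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)    # one-point crossover
            child = mother[:cut] + father[cut:]
            # Flip each bit with a small probability (mutation).
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)


# Toy stand-in for "SVM accuracy": reward three informative features
# and charge a small penalty for every selected feature.
INFORMATIVE = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


if __name__ == "__main__":
    best = ga_select(8, toy_fitness)
    print(best, toy_fitness(best))
```

Keeping the fitness function pluggable separates the search from the classifier, so the same loop could wrap any scoring procedure.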
In conclusion, we constructed a comprehensive dataset drawn from different methods; in particular, the negative data were generated from TCGA expression profiles by selecting miRNA-target pairs with a high Pearson correlation coefficient. A GA-SVM model was built for miRNA target prediction. Several kinds of information about miRNA-target interactions were taken into account, laying a foundation for researchers to investigate miRNA-target interactions.
|
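The correlation-based selection of negative pairs mentioned in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the 0.8 threshold, and the toy expression values (which are invented, not TCGA data). The idea is simply that a miRNA and a gene whose expression levels rise and fall together across samples are treated as unlikely to be in a repressive interaction.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def negative_candidates(mirna_expr, mrna_expr, threshold=0.8):
    """Return (miRNA, gene, r) triples whose correlation is >= threshold.

    Both arguments map a name to its expression values across the same
    ordered set of samples. The threshold is a hypothetical choice.
    """
    pairs = []
    for mirna, m_vals in mirna_expr.items():
        for gene, g_vals in mrna_expr.items():
            r = pearson(m_vals, g_vals)
            if r >= threshold:
                pairs.append((mirna, gene, r))
    return pairs


if __name__ == "__main__":
    mirnas = {"miR-A": [1.0, 2.0, 3.0, 4.0]}
    genes = {
        "GENE1": [2.0, 4.1, 6.2, 7.9],   # rises with miR-A
        "GENE2": [8.0, 6.0, 4.0, 2.0],   # falls as miR-A rises
    }
    for mirna, gene, r in negative_candidates(mirnas, genes):
        print(mirna, gene, round(r, 3))
```

In this toy input only the co-varying pair passes the filter; the anti-correlated pair, which looks more like a genuine repression, is left out of the negative set.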