Bi-relational Network Analysis between Local Structural Similarity and Disease Using Random Walk with Restart

Bibliographic Details
Main Authors: Jih-Hsu Chang, 張日旭
Other Authors: Chi-Hua Tung
Format: Others
Language: zh-TW
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/10396628596409604210
Description
Summary: Master's === Chung Hua University === Master's Program, Department of Bioinformatics === 102 === In research on pathogenic mechanisms, most methods use protein-protein interaction (PPI) networks to predict the relationship between sequence similarity and disease. This thesis offers a new perspective: the network is constructed from local structural similarity. The network is then analyzed with Random Walk with Restart (RWR) to measure the proximity of network nodes and assess the association between proteins and disease. We selected a non-redundant data set of Homo sapiens protein structures as the training set and queried the OMIM database to determine whether each protein is pathogenic. After transforming the protein structures into structural alphabet sequences, we identified the "Unit of Structural Alphabet" (USA), which consists of three segments (two secondary structures and one loop) and represents a local structural feature of a protein. We then used 3D-BLAST, a rapid structural alignment tool, to build a USA similarity network. Given the protein-disease relationships and the local structural similarity network, we used the RWR algorithm to calculate the proximity of each USA and determined the pathogenic likelihood threshold λ used to predict disease. Our results show that sensitivity, specificity, and F1-score are 0.855, 0.971, and 0.8874, respectively, when λ is 0.45. This demonstrates that the local structural similarity network can reasonably capture the association between proteins and disease and can effectively predict the pathogenic potential of an unknown protein. In practical disease prediction, this approach considers neither protein interactions nor whole-protein structure alignment; only a small number of USA comparisons are needed to quickly assess a protein's pathogenic potential.
In the future, this research can be extended to explore the relationships among protein structure, disease, drugs, and side effects, in order to strengthen practical clinical applications and enhance human well-being.
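
The RWR step described in the summary can be illustrated with a minimal sketch. This is not the thesis implementation: the toy five-node network, the restart probability of 0.7, the rescaling of scores by their maximum before applying the threshold, and all function names are assumptions introduced here for illustration. The sketch uses the standard RWR iteration p(t+1) = (1 - r) W p(t) + r p(0), where W is the column-normalized adjacency matrix and p(0) places equal probability on the seed nodes (for example, USAs from proteins with known disease associations).

```python
def column_normalize(adj):
    """Scale each column of a square matrix to sum to 1 (all-zero columns stay zero)."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk restarting at `seeds`."""
    n = len(adj)
    w = column_normalize(adj)
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        change = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if change < tol:  # stop once the L1 change between iterations is negligible
            break
    return p


def predict(scores, threshold=0.45):
    """Flag nodes whose max-rescaled score reaches the threshold.

    Assumption: rescaling by the maximum is a hypothetical way to map scores
    to [0, 1]; the thesis abstract does not say how the likelihood is derived.
    """
    top = max(scores)
    return [(s / top if top else 0.0) >= threshold for s in scores]


def evaluate(predicted, actual):
    """Compute sensitivity, specificity, and F1 from boolean predictions and labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }


# Hypothetical similarity network: a chain of five nodes, 0-1-2-3-4.
toy_adj = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]

if __name__ == "__main__":
    scores = random_walk_with_restart(toy_adj, seeds=[0])
    print(scores)
    print(predict(scores, threshold=0.45))
```

In this sketch, nodes closer to the seed receive higher proximity scores, which is the property the summary relies on when it ranks USAs by their proximity to known pathogenic ones. The `evaluate` helper shows how sensitivity, specificity, and F1-score of the kind reported above could be computed once predictions are compared against known labels.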