Internal Repeat Segmentation and Subtype Classification of α-solenoid Proteins
Master's === National Taiwan Ocean University === Department of Computer Science and Engineering === 104 === Tandem repeat structures are widely distributed among all classes of proteins. Various basic structural units of repetitive nature possess functional diversity and reflect important influences on protein interaction and biological responses for different organi...
Main Authors: | Cheng, Hung-Yi (鄭弘翊) |
---|---|
Other Authors: | Pai, Tun-Wen (白敦文) |
Format: | Others |
Language: | en_US |
Published: | 2015 |
Online Access: | http://ndltd.ncl.edu.tw/handle/ds46bf |
id |
ndltd-TW-104NTOU5394007 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-104NTOU5394007 2019-05-15T23:00:44Z http://ndltd.ncl.edu.tw/handle/ds46bf Internal Repeat Segmentation and Subtype Classification of α-solenoid Proteins α螺旋蛋白質內部重複單元切割與分類 Cheng, Hung-Yi 鄭弘翊 Master's National Taiwan Ocean University Department of Computer Science and Engineering 104 Pai, Tun-Wen 白敦文 2015 學位論文 ; thesis 40 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others
|
sources |
NDLTD |
description |
Master's === National Taiwan Ocean University === Department of Computer Science and Engineering === 104 === Tandem repeat structures are widely distributed across all classes of proteins. Repetitive basic structural units are functionally diverse and strongly influence protein interactions and biological responses in different organisms. One of the most common protein repeat structures is the α-solenoid tandem repeat, whose internal repeat units share low sequence similarity with one another. A segmentation and classification system for α-solenoid repeats therefore cannot rely mainly on sequence-alignment-based approaches. To support comprehensive analysis of fundamental repeat unit segmentation, subclass identification, and functional annotation of such repeat structures, we developed an automatic identification system based on geometric characteristics and secondary structure information. The Psi and Alpha dihedral angles were used to locate candidate α-helix elements, and the included angle between the vectors defined by neighboring α-helix elements was calculated to construct fundamental repeat units. The length of helix elements, geometric curvature, and the relative position of neighboring repeat units were considered when classifying the subtypes of α-solenoid tandem repeats. To evaluate the system, we used three datasets: 923 α-solenoid repeats collected from the RepeatsDB database, 905 α-solenoid repeats retrieved from the CATH database, and 166 α-solenoid repeats collected from the SMART/Pfam database. The proposed system achieved a recall of 94.24%, precision of 76.16%, specificity of 99.76%, and accuracy of 99.71% for identifying α-solenoid repeats.
For internal repeat unit segmentation of the identified repeats, the system achieved a recall of 94.20%, precision of 94.66%, specificity of 96.73%, and accuracy of 95.62%. For subtype classification, it achieved a recall of 81.76%, precision of 82.46%, specificity of 96.06%, and accuracy of 93.38%. This is the first comprehensive classification system that identifies four different subtypes of α-solenoid repeats and also provides fundamental internal repeat segmentation and geometric annotation. The online recognition system and its user-friendly interface could help structural biologists efficiently compare common and unique features of different subtypes of α-solenoid tandem repeats, and may benefit protein classification, annotation, and possibly biological experiments.
|
author2 |
Pai, Tun-Wen |
author_facet |
Pai, Tun-Wen Cheng, Hung-Yi 鄭弘翊 |
author |
Cheng, Hung-Yi 鄭弘翊 |
spellingShingle |
Cheng, Hung-Yi 鄭弘翊 Internal Repeat Segmentation and Subtype Classification of α-solenoid Proteins |
author_sort |
Cheng, Hung-Yi |
title |
Internal Repeat Segmentation and Subtype Classification of α-solenoid Proteins |
title_sort |
internal repeat segmentation and subtype classification of α-solenoid proteins |
publishDate |
2015 |
url |
http://ndltd.ncl.edu.tw/handle/ds46bf |
work_keys_str_mv |
AT chenghungyi internalrepeatsegmentationandsubtypeclassificationofasolenoidproteins AT zhènghóngyì internalrepeatsegmentationandsubtypeclassificationofasolenoidproteins AT chenghungyi aluóxuándànbáizhìnèibùzhòngfùdānyuánqiègēyǔfēnlèi AT zhènghóngyì aluóxuándànbáizhìnèibùzhòngfùdānyuánqiègēyǔfēnlèi |
_version_ |
1719138389813362688 |
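
The abstract in this record mentions two geometric quantities: dihedral angles used to locate candidate α-helix elements, and the included angle between vectors of neighboring helix elements. The Python sketch below illustrates how such quantities can be computed from 3D coordinates. It is not the thesis implementation. The function names (`dihedral_deg`, `helix_axis`, `included_angle_deg`) are hypothetical, and approximating a helix axis by the difference between the centroids of its last and first `k` Cα atoms is an assumption made only for illustration. The Alpha angle is taken here in the DSSP sense (the dihedral over four consecutive Cα atoms), which may differ from the definition used in the thesis.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points.

    For Psi the points would be N(i), CA(i), C(i), N(i+1); for a
    DSSP-style Alpha they would be CA(i-1), CA(i), CA(i+1), CA(i+2).
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)              # normal of the plane (p0, p1, p2)
    n2 = _cross(b1, b2)              # normal of the plane (p1, p2, p3)
    b1_len = _norm(b1)
    b1_hat = (b1[0] / b1_len, b1[1] / b1_len, b1[2] / b1_len)
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b1_hat)
    return math.degrees(math.atan2(y, x))

def helix_axis(ca_coords, k=4):
    """Approximate helix axis (assumption): centroid of the last k
    CA atoms minus centroid of the first k CA atoms."""
    def centroid(points):
        n = len(points)
        return (sum(p[0] for p in points) / n,
                sum(p[1] for p in points) / n,
                sum(p[2] for p in points) / n)
    return _sub(centroid(ca_coords[-k:]), centroid(ca_coords[:k]))

def included_angle_deg(u, v):
    """Unsigned angle in degrees between two vectors, in [0, 180]."""
    cos_t = _dot(u, v) / (_norm(u) * _norm(v))
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
    return math.degrees(math.acos(cos_t))
```

In use, one would compute `helix_axis` for two neighboring helix elements and pass both results to `included_angle_deg`. An angle near 180 degrees indicates an antiparallel pair, the arrangement typical of helix-turn-helix repeat units. Any thresholds for accepting a pair as a repeat unit would have to come from the thesis itself and are not given in this record.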