Internal Repeat Segmentation and Subtype Classification of α-solenoid Proteins

Master's === National Taiwan Ocean University === Department of Computer Science and Engineering === 104 === Tandem repeat structures are widely distributed among all classes of proteins. Various basic structural units of repetitive nature possess functional diversity and reflect important influences on protein interaction and biological responses for different organi...

Bibliographic Details
Main Authors: Cheng, Hung-Yi, 鄭弘翊
Other Authors: Pai, Tun-Wen
Format: Others
Language: en_US
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/ds46bf
id ndltd-TW-104NTOU5394007
record_format oai_dc
spelling ndltd-TW-104NTOU5394007 2019-05-15T23:00:44Z http://ndltd.ncl.edu.tw/handle/ds46bf Internal Repeat Segmentation and Subtype Classification of α-solenoid Proteins α螺旋蛋白質內部重複單元切割與分類 Cheng, Hung-Yi 鄭弘翊 Master's National Taiwan Ocean University Department of Computer Science and Engineering 104 Pai, Tun-Wen 白敦文 2015 學位論文 ; thesis 40 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Taiwan Ocean University === Department of Computer Science and Engineering === 104 === Tandem repeat structures are widely distributed among all classes of proteins. Their basic repetitive structural units are functionally diverse and strongly influence protein interactions and biological responses across organisms. One of the most common types of protein repeat structure is the α-solenoid tandem repeat, in which any two internal repeat units of a structure share low sequence similarity. A successful segmentation and classification system for α-solenoid repeats therefore cannot rely mainly on sequence-alignment-based approaches. To support comprehensive analysis of fundamental repeat unit segmentation, subclass identification, and functional annotation of such repeat structures, we developed an automatic identification system based on geometrical characteristics and secondary structure information. The Psi and Alpha dihedral angles were used to locate candidate α-helix elements, and the included angle between the vectors formed by neighboring α-helix elements was calculated to construct fundamental repeat units. The lengths of helix elements, geometric curvatures, and relative positions of neighboring repeat units were used to classify the subtypes of α-solenoid tandem repeats. To evaluate the system, we used three datasets: 923 α-solenoid repeats collected from the RepeatsDB database, 905 α-solenoid repeats retrieved from the CATH database, and 166 α-solenoid repeats collected from the SMART/Pfam database.
The results showed that the proposed system identified α-solenoid repeats with a recall of 94.24%, precision of 76.16%, specificity of 99.76%, and accuracy of 99.71%. For internal repeat unit segmentation of the identified repeats, it achieved a recall of 94.20%, precision of 94.66%, specificity of 96.73%, and accuracy of 95.62%. For subtype classification, it achieved a recall of 81.76%, precision of 82.46%, specificity of 96.06%, and accuracy of 93.38%. This is the first comprehensive classification system that identifies four different subtypes of α-solenoid repeats and also provides fundamental internal repeat segmentation and geometric annotation. With its online recognition service and user-friendly interface, the system could help structural biologists efficiently compare the common and unique features of different α-solenoid tandem repeat subtypes, and it should benefit protein classification, annotation, and possibly biological experiments.
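The abstract mentions the included angle between vectors formed by neighboring α-helix elements but does not say how those vectors are built. The following is a minimal sketch, not the thesis's implementation: it assumes each helix is given as a list of C-alpha coordinates and approximates the helix axis as the vector between the centroids of its first and last few atoms. The function names and the `window` parameter are hypothetical.

```python
import math


def centroid(points):
    """Return the mean (x, y, z) of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def helix_axis(ca_coords, window=4):
    """Approximate a helix axis vector (assumption, not from the thesis):
    centroid of the last `window` C-alpha atoms minus centroid of the first."""
    if len(ca_coords) < window:
        raise ValueError("helix is shorter than the averaging window")
    start = centroid(ca_coords[:window])
    end = centroid(ca_coords[-window:])
    return tuple(e - s for s, e in zip(start, end))


def included_angle(u, v):
    """Return the angle in degrees between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("zero-length vector")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))
```

Under this sketch, two neighboring helices that pack antiparallel give an angle near 180 degrees and parallel ones give an angle near 0. The abstract does not state which angle range the system uses to group helices into a repeat unit, so no threshold is shown here.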
author2 Pai, Tun-Wen
author_facet Pai, Tun-Wen
Cheng, Hung-Yi
鄭弘翊
author Cheng, Hung-Yi
鄭弘翊
title Internal Repeat Segmentation and Subtype Classification of α-solenoid Proteins
publishDate 2015
url http://ndltd.ncl.edu.tw/handle/ds46bf