Incorporating distant sequence features and radial basis function networks to identify ubiquitin conjugation sites.

Ubiquitin (Ub) is a small protein of 76 amino acids with a mass of about 8.5 kDa. In ubiquitin conjugation, ubiquitin is attached mainly to a lysine residue of the target protein by Ub-ligating (E3) enzymes. Three major enzymes participate in ubiquitin conjugation: E1, E2 and E3, which are resp...


Bibliographic Details
Main Authors: Tzong-Yi Lee, Shu-An Chen, Hsin-Yi Hung, Yu-Yen Ou
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2011-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3052307?pdf=render
id doaj-b84b41bf09f4426da91b429cb8454ab7
record_format Article
spelling doaj-b84b41bf09f4426da91b429cb8454ab7 | 2020-11-25T02:32:12Z | eng | Public Library of Science (PLoS) | PLoS ONE | ISSN 1932-6203 | 2011-01-01 | 6(3): e17331 | doi:10.1371/journal.pone.0017331
collection DOAJ
language English
format Article
sources DOAJ
author Tzong-Yi Lee
Shu-An Chen
Hsin-Yi Hung
Yu-Yen Ou
title Incorporating distant sequence features and radial basis function networks to identify ubiquitin conjugation sites.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2011-01-01
description Ubiquitin (Ub) is a small protein of 76 amino acids with a mass of about 8.5 kDa. In ubiquitin conjugation, ubiquitin is attached mainly to a lysine residue of the target protein by Ub-ligating (E3) enzymes. Three major enzymes participate in ubiquitin conjugation: E1, E2 and E3, which are responsible for activating, conjugating and ligating ubiquitin, respectively. Ubiquitin conjugation in eukaryotes is an important mechanism in the proteasome-mediated degradation of proteins and in regulating the activity of transcription factors. Motivated by the importance of ubiquitin conjugation in biological processes, this investigation develops a method, UbSite, which uses an efficient radial basis function (RBF) network to identify protein ubiquitin conjugation (ubiquitylation) sites. This work investigates not only the amino acid composition but also the structural characteristics, physicochemical properties, and evolutionary information of amino acids around ubiquitylation (Ub) sites. With reference to the pathway of ubiquitin conjugation, the substrate sites for E3 recognition, which are distant from ubiquitylation sites, are also investigated. F-score measurements over a large window (-20 to +20) revealed statistically significant amino acid composition and position-specific scoring matrix (evolutionary information) features, most of which are located at positions distant from Ub sites. This distant information can be used effectively to differentiate Ub sites from non-Ub sites. As determined by five-fold cross-validation, the model trained on the combination of amino acid composition and evolutionary information performs best in identifying ubiquitin conjugation sites, with a prediction sensitivity, specificity, and accuracy of 65.5%, 74.8%, and 74.5%, respectively. Although the amino acid sequences around ubiquitin conjugation sites do not contain conserved motifs, the cross-validation result indicates that integrating distant sequence features of Ub sites can improve predictive performance. Additionally, an independent test demonstrates that the proposed method can outperform other ubiquitylation prediction tools.
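The window-based amino acid composition mentioned in the description can be illustrated with a minimal Python sketch. It is not the UbSite implementation: the function names, the "-" padding character, and the choice to count the central lysine in the composition are assumptions made here for illustration only. The -20 to +20 window is the one stated in the abstract.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order for the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def lysine_windows(sequence, flank=20, pad="-"):
    """Yield (1-based position, window) for every lysine in the sequence.

    Each window spans -flank..+flank around the lysine; positions that
    fall outside the protein are filled with the pad character
    (hypothetical convention, not taken from the paper).
    """
    padded = pad * flank + sequence + pad * flank
    for i, residue in enumerate(sequence):
        if residue == "K":
            yield i + 1, padded[i : i + 2 * flank + 1]


def amino_acid_composition(window, pad="-"):
    """Return the 20 amino acid frequencies of a window, ignoring padding."""
    residues = [r for r in window if r != pad]
    counts = Counter(residues)
    total = len(residues)
    return [counts[a] / total for a in AMINO_ACIDS]
```

For example, `lysine_windows("MKAAK", flank=2)` yields one window per lysine, and `amino_acid_composition` turns each window into a 20-dimensional vector that a classifier such as an RBF network could take as input.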
url http://europepmc.org/articles/PMC3052307?pdf=render