Prediction of contact residue pairs based on co-substitution between sites in protein structures.

Residue-residue interactions that fold a protein into a unique three-dimensional structure and enable it to perform a specific function impose structural and functional constraints to varying degrees on each residue site. Selective constraints on residue sites are recorded in amino acid orders in homologous...


Bibliographic Details
Main Author: Sanzo Miyazawa
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2013-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3546969?pdf=render
id doaj-50a0e3ce07994709ab9bbecdc973e62a
record_format Article
spelling doaj-50a0e3ce07994709ab9bbecdc973e62a; 2020-11-25T02:53:47Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2013-01-01; volume 8, issue 1, e54252; doi:10.1371/journal.pone.0054252
collection DOAJ
language English
format Article
sources DOAJ
author Sanzo Miyazawa
title Prediction of contact residue pairs based on co-substitution between sites in protein structures.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2013-01-01
description Residue-residue interactions that fold a protein into a unique three-dimensional structure and enable it to perform a specific function impose structural and functional constraints to varying degrees on each residue site. Selective constraints on residue sites are recorded in amino acid orders in homologous sequences and also in the evolutionary trace of amino acid substitutions. A challenge is to extract direct dependences between residue sites by removing phylogenetic correlations and indirect dependences through other residues within a protein or even through other molecules. Rapid growth of protein families with unknown folds requires an accurate de novo prediction method for protein structure. Recent attempts to disentangle direct from indirect dependences of amino acid types between residue positions in multiple sequence alignments have revealed that inferred residue-residue proximities can be sufficient information to predict a protein fold without the use of known three-dimensional structures. Here, we propose an alternative method of inferring coevolving site pairs from concurrent and compensatory substitutions between sites in each branch of a phylogenetic tree. Substitution probability and physico-chemical changes (volume, charge, hydrogen-bonding capability, and others) accompanying substitutions at each site in each branch of a phylogenetic tree are estimated with the likelihood of each substitution, and their direct correlations between sites are used to detect concurrent and compensatory substitutions. To extract direct dependences between sites, partial correlation coefficients of the characteristic changes along branches between sites, in which linear multiple dependences on feature vectors at other sites are removed, are calculated and used to rank coevolving site pairs. Accuracy of contact prediction based on the present coevolution score is comparable to that achieved by a maximum entropy model of protein sequences for 15 protein families taken from Pfam release 26.0. This accuracy further indicates that compensatory substitutions are significant in protein evolution.
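
The partial-correlation step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes a single scalar feature per site per branch (the abstract uses feature vectors), it skips the phylogenetic estimation of substitutions entirely, and the function names, the small `ridge` regularizer, and the synthetic data are all hypothetical choices made for illustration. Partial correlations are read off the inverse of the correlation matrix, which removes linear dependence on all other sites.

```python
import numpy as np


def partial_correlations(features: np.ndarray, ridge: float = 1e-6) -> np.ndarray:
    """Partial correlation between every pair of sites.

    features: array of shape (n_branches, n_sites), one hypothetical scalar
    change per site per branch. The ridge term is an assumption added only
    to keep the matrix inversion numerically stable.
    """
    corr = np.corrcoef(features, rowvar=False)
    corr = corr + ridge * np.eye(corr.shape[0])
    precision = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(precision))
    # rho_ij given all other sites = -P_ij / sqrt(P_ii * P_jj)
    pcorr = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


def rank_site_pairs(pcorr: np.ndarray) -> list:
    """Return (i, j, score) for all i < j, highest partial correlation first."""
    i, j = np.triu_indices_from(pcorr, k=1)
    order = np.argsort(-pcorr[i, j])
    return [(int(i[k]), int(j[k]), float(pcorr[i[k], j[k]])) for k in order]


# Synthetic example: site 0 influences site 1, and site 1 influences site 2,
# so sites 0 and 2 are correlated only indirectly through site 1.
rng = np.random.default_rng(0)
n_branches = 5000
x0 = rng.normal(size=n_branches)
x1 = x0 + rng.normal(size=n_branches)
x2 = x1 + rng.normal(size=n_branches)
features = np.column_stack([x0, x1, x2])

corr = np.corrcoef(features, rowvar=False)
pcorr = partial_correlations(features)
ranked = rank_site_pairs(pcorr)

print("plain correlation, sites 0-2:  ", round(float(corr[0, 2]), 3))
print("partial correlation, sites 0-2:", round(float(pcorr[0, 2]), 3))
print("top-ranked pair:", ranked[0][:2])
```

In this toy chain the plain correlation between sites 0 and 2 is substantial, while their partial correlation is close to zero, so only the two directly coupled pairs rank highly.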
url http://europepmc.org/articles/PMC3546969?pdf=render