Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations.

Over evolutionary time, members of a superfamily of homologous proteins sharing a common structural core diverge into subgroups filling various functional niches. At the sequence level, such divergence appears as correlations that arise from residue patterns distinct to each subgroup. Such a superfa...

Bibliographic Details
Main Authors:	Andrew F Neuwald, Stephen F Altschul
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2016-12-01
Series:	PLoS Computational Biology
Online Access:	http://europepmc.org/articles/PMC5225019?pdf=render

id	doaj-1ff1735102f349bab8cb44c0d261f91e
record_format	Article
spelling	doaj-1ff1735102f349bab8cb44c0d261f91e 2020-11-25T01:11:55Z eng Public Library of Science (PLoS) PLoS Computational Biology 1553-734X 1553-7358 2016-12-01 12 12 e1005294 10.1371/journal.pcbi.1005294 Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations. Andrew F Neuwald Stephen F Altschul http://europepmc.org/articles/PMC5225019?pdf=render
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Andrew F Neuwald Stephen F Altschul
author_sort	Andrew F Neuwald
title	Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations.
title_sort	inference of functionally-relevant n-acetyltransferase residues based on statistical correlations.
publisher	Public Library of Science (PLoS)
series	PLoS Computational Biology
issn	1553-734X 1553-7358
publishDate	2016-12-01
description	Over evolutionary time, members of a superfamily of homologous proteins sharing a common structural core diverge into subgroups filling various functional niches. At the sequence level, such divergence appears as correlations that arise from residue patterns distinct to each subgroup. Such a superfamily may be viewed as a population of sequences corresponding to a complex, high-dimensional probability distribution. Here we model this distribution as hierarchical interrelated hidden Markov models (hiHMMs), which describe these sequence correlations implicitly. By characterizing such correlations one may hope to obtain information regarding functionally-relevant properties that have thus far evaded detection. To do so, we infer a hiHMM distribution from sequence data using Bayes' theorem and Markov chain Monte Carlo (MCMC) sampling, which is widely recognized as the most effective approach for characterizing a complex, high-dimensional distribution. Other routines then map correlated residue patterns to available structures with a view to hypothesis generation. When applied to N-acetyltransferases, this reveals sequence and structural features indicative of functionally important, yet generally unknown, biochemical properties. Even for sets of proteins for which nothing is known beyond unannotated sequences and structures, this can lead to helpful insights. We describe, for example, a putative coenzyme-A-induced-fit substrate binding mechanism mediated by arginine residue switching between salt bridge and π-π stacking interactions. A suite of programs implementing this approach is available (psed.igs.umaryland.edu).
url	http://europepmc.org/articles/PMC5225019?pdf=render
_version_	1725168891308539904

Similar Items