Modeling transcriptional activation changes to Gal4 variants via structure-based computational mutagenesis

As a DNA binding transcriptional activator, Gal4 promotes the expression of genes responsible for galactose metabolism. The Gal4 protein from Saccharomyces cerevisiae (baker’s yeast) has become a model for studying eukaryotic transcriptional activation in general because its regulatory properties mi...


Bibliographic Details
Main Authors: Majid Masso, Nitin Rao, Purnima Pyarasani
Format: Article
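Language: English
Published: PeerJ Inc. 2018-05-01
Series: PeerJ
Subjects: Knowledge-based potential; Variant function prediction; Computational mutagenesis; Structure–function relationships; Machine learning; Gal4
Online Access: https://peerj.com/articles/4844.pdf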
id doaj-3964afe5e25f423dad50b87ff220be9d
record_format Article
spelling doaj-3964afe5e25f423dad50b87ff220be9d | 2020-11-25T01:19:25Z | eng | PeerJ Inc. | PeerJ | 2167-8359 | 2018-05-01 | 6 | e4844 | 10.7717/peerj.4844 | Modeling transcriptional activation changes to Gal4 variants via structure-based computational mutagenesis | Majid Masso; Nitin Rao; Purnima Pyarasani (all: Laboratory for Structural Bioinformatics, School of Systems Biology, George Mason University, Manassas, VA, United States of America) | https://peerj.com/articles/4844.pdf | Knowledge-based potential; Variant function prediction; Computational mutagenesis; Structure–function relationships; Machine learning; Gal4
collection DOAJ
language English
format Article
sources DOAJ
author Majid Masso
Nitin Rao
Purnima Pyarasani
title Modeling transcriptional activation changes to Gal4 variants via structure-based computational mutagenesis
publisher PeerJ Inc.
series PeerJ
issn 2167-8359
publishDate 2018-05-01
description As a DNA binding transcriptional activator, Gal4 promotes the expression of genes responsible for galactose metabolism. The Gal4 protein from Saccharomyces cerevisiae (baker’s yeast) has become a model for studying eukaryotic transcriptional activation in general because its regulatory properties mirror those of several eukaryotic organisms, including mammals. Given the availability of a crystallographic structure for Gal4, here we implement an in silico mutagenesis technique that makes use of a four-body knowledge-based energy function, in order to empirically quantify the structural impacts associated with single residue substitutions on the Gal4 protein. These results were used to examine the structure-function relationship in Gal4 based on a recently published experimental mutagenesis study, whereby functional changes to a uniformly distributed set of 1,068 single residue Gal4 variants were obtained by measuring their transcriptional activation levels relative to wild-type. A significant correlation was observed between computed (scalar) structural effect data and measured activity values for this collection of single residue Gal4 variants. Additionally, attribute vectors quantifying position-specific environmental impacts were generated for each of the Gal4 variants via computational mutagenesis, and we implemented supervised classification and regression statistical machine learning algorithms to train predictive models of variant Gal4 activity based on these structural changes. All models performed well under cross-validation testing, with balanced accuracy reaching 91% among the classification models, and with the actual and predicted activity values displaying a correlation as high as r = 0.80 for the regression models. Reliable predictions of transcriptional activation levels for Gal4 variants that have yet to be studied can be instantly generated by submitting their respective structure-based feature vectors to the trained models for testing. 
Such a computational pre-screening of Gal4 variants may potentially reduce costs associated with running large-scale mutagenesis experiments.
topic Knowledge-based potential
Variant function prediction
Computational mutagenesis
Structure–function relationships
Machine learning
Gal4
url https://peerj.com/articles/4844.pdf