Inferring a complete genotype-phenotype map from a small number of measured phenotypes.
Understanding evolution requires detailed knowledge of genotype-phenotype maps; however, it can be a herculean task to measure every phenotype in a combinatorial map. We have developed a computational strategy to predict the missing phenotypes from an incomplete, combinatorial genotype-phenotype map...
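Main Authors: | Zachary R Sailer, Sarah H Shafik, Robert L Summers, Alex Joule, Alice Patterson-Robert, Rowena E Martin, Michael J Harms |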
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2020-09-01 |
Series: | PLoS Computational Biology |
Online Access: | https://doi.org/10.1371/journal.pcbi.1008243 |
id |
doaj-020e794cc46b429399bb79a85a12ccc8 |
---|---|
record_format |
Article |
spelling |
doaj-020e794cc46b429399bb79a85a12ccc8 2021-04-21T15:18:03Z eng Public Library of Science (PLoS) PLoS Computational Biology 1553-734X 1553-7358 2020-09-01 16 9 e1008243 10.1371/journal.pcbi.1008243 Inferring a complete genotype-phenotype map from a small number of measured phenotypes. Zachary R Sailer Sarah H Shafik Robert L Summers Alex Joule Alice Patterson-Robert Rowena E Martin Michael J Harms https://doi.org/10.1371/journal.pcbi.1008243 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Zachary R Sailer Sarah H Shafik Robert L Summers Alex Joule Alice Patterson-Robert Rowena E Martin Michael J Harms |
title |
Inferring a complete genotype-phenotype map from a small number of measured phenotypes. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS Computational Biology |
issn |
1553-734X 1553-7358 |
publishDate |
2020-09-01 |
description |
Understanding evolution requires detailed knowledge of genotype-phenotype maps; however, it can be a herculean task to measure every phenotype in a combinatorial map. We have developed a computational strategy to predict the missing phenotypes from an incomplete, combinatorial genotype-phenotype map. As a test case, we used an incomplete genotype-phenotype dataset previously generated for the malaria parasite's 'chloroquine resistance transporter' (PfCRT). Wild-type PfCRT (PfCRT3D7) lacks significant chloroquine (CQ) transport activity, but the introduction of the eight mutations present in the 'Dd2' isoform of PfCRT (PfCRTDd2) enables the protein to transport CQ away from its site of antimalarial action. This gain of a transport function imparts CQ resistance to the parasite. A combinatorial map between PfCRT3D7 and PfCRTDd2 consists of 256 genotypes, of which only 52 have had their CQ transport activities measured through expression in the Xenopus laevis oocyte. We trained a statistical model with these 52 measurements to infer the CQ transport activity for the remaining 204 combinatorial genotypes between PfCRT3D7 and PfCRTDd2. Our best-performing model incorporated a binary classifier, a nonlinear scale, and additive effects for each mutation. The addition of specific pairwise- and high-order-epistatic coefficients decreased the predictive power of the model. We evaluated our predictions by experimentally measuring the CQ transport activities of 24 additional PfCRT genotypes. The R² value between our predicted and newly-measured phenotypes was 0.90. We then used the model to probe the accessibility of evolutionary trajectories through the map. Approximately 1% of the possible trajectories between PfCRT3D7 and PfCRTDd2 are accessible; however, none of the trajectories entailed eight successive increases in CQ transport activity. These results demonstrate that phenotypes can be inferred with known uncertainty from a partial genotype-phenotype dataset. We also validated our approach against a collection of previously published genotype-phenotype maps. The model therefore appears general and should be applicable to a large number of genotype-phenotype maps. |
url |
https://doi.org/10.1371/journal.pcbi.1008243 |
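The counts quoted in the description follow from simple combinatorics, and the short sketch below reproduces them as a sanity check. It is illustrative only and is not taken from the article: the variable names are hypothetical, and it assumes that "possible trajectories" means forward-only paths that add one of the eight mutations per step, which this record does not state.

```python
from itertools import product
from math import factorial

N_SITES = 8       # mutations separating PfCRT3D7 from PfCRTDd2 (from the description)
N_MEASURED = 52   # genotypes with measured CQ transport activity (from the description)

# Encode each genotype as a tuple of 0/1 flags per site:
# 0 = residue as in the 3D7 isoform, 1 = residue as in the Dd2 isoform.
genotypes = list(product((0, 1), repeat=N_SITES))
n_genotypes = len(genotypes)                # 2**8 combinatorial genotypes
n_unmeasured = n_genotypes - N_MEASURED     # phenotypes left for the model to infer

# ASSUMPTION: a trajectory is an ordering of the eight mutations, each added
# exactly once with no reversions, so the number of trajectories is 8!.
n_trajectories = factorial(N_SITES)

print(f"genotypes in the map:     {n_genotypes}")
print(f"unmeasured genotypes:     {n_unmeasured}")
print(f"forward-only trajectories: {n_trajectories}")
```

Under that assumption, the "approximately 1%" of accessible trajectories mentioned in the description would correspond to roughly 400 of the 40,320 orderings.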