Guaranteed Diversity and Optimality in Cost Function Network Based Computational Protein Design Methods

Proteins are the main active molecules of life. Although natural proteins play many roles, as enzymes or antibodies for example, there is a need to go beyond the repertoire of natural proteins to produce engineered proteins that precisely meet application requirements, in terms of function, stabilit...

Bibliographic Details
Main Authors: Manon Ruffini, Jelena Vucinic, Simon de Givry, George Katsirelos, Sophie Barbe, Thomas Schiex
Format: Article
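Language: English
Published: MDPI AG 2021-05-01
Series: Algorithms
Subjects: computational protein design; graphical models; automata; cost function networks; structural biology; diversity
Online Access: https://www.mdpi.com/1999-4893/14/6/168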
id doaj-fb28dc2e2fe6464599d5a0e277b74994
record_format Article
spelling doaj-fb28dc2e2fe6464599d5a0e277b74994
2021-06-01T01:26:27Z
eng
MDPI AG
Algorithms
1999-4893
2021-05-01
14
168
168
10.3390/a14060168
Guaranteed Diversity and Optimality in Cost Function Network Based Computational Protein Design Methods
Manon Ruffini (Université Fédérale de Toulouse, ANITI, INRAE, UR 875, 31326 Toulouse, France)
Jelena Vucinic (TBI, Université de Toulouse, CNRS, INRAE, INSA, ANITI, 31077 Toulouse, France)
Simon de Givry (Université Fédérale de Toulouse, ANITI, INRAE, UR 875, 31326 Toulouse, France)
George Katsirelos (MIA-Paris-Mathématiques et Informatique Appliquées, INRAE, 75231 Paris, France)
Sophie Barbe (TBI, Université de Toulouse, CNRS, INRAE, INSA, ANITI, 31077 Toulouse, France)
Thomas Schiex (Université Fédérale de Toulouse, ANITI, INRAE, UR 875, 31326 Toulouse, France)
Proteins are the main active molecules of life. Although natural proteins play many roles, as enzymes or antibodies for example, there is a need to go beyond the repertoire of natural proteins to produce engineered proteins that precisely meet application requirements, in terms of function, stability, activity or other protein capacities. Computational Protein Design aims at designing new proteins from first principles, using full-atom molecular models. However, the size and complexity of proteins require approximations to make them amenable to energetic optimization queries. These approximations make the design process less reliable, and a provably optimal solution may fail. In practice, expensive libraries of solutions are therefore generated and tested. In this paper, we explore the idea of generating libraries of provably diverse low-energy solutions by extending cost function network algorithms with dedicated automaton-based diversity constraints on a large set of realistic full protein redesign problems. We observe that it is possible to generate provably diverse libraries in reasonable time and that the produced libraries do enhance the Native Sequence Recovery, a traditional measure of the reliability of design methods.
https://www.mdpi.com/1999-4893/14/6/168
computational protein design; graphical models; automata; cost function networks; structural biology; diversity
collection DOAJ
language English
format Article
sources DOAJ
author Manon Ruffini
Jelena Vucinic
Simon de Givry
George Katsirelos
Sophie Barbe
Thomas Schiex
title Guaranteed Diversity and Optimality in Cost Function Network Based Computational Protein Design Methods
publisher MDPI AG
series Algorithms
issn 1999-4893
publishDate 2021-05-01
description Proteins are the main active molecules of life. Although natural proteins play many roles, as enzymes or antibodies for example, there is a need to go beyond the repertoire of natural proteins to produce engineered proteins that precisely meet application requirements, in terms of function, stability, activity or other protein capacities. Computational Protein Design aims at designing new proteins from first principles, using full-atom molecular models. However, the size and complexity of proteins require approximations to make them amenable to energetic optimization queries. These approximations make the design process less reliable, and a provably optimal solution may fail. In practice, expensive libraries of solutions are therefore generated and tested. In this paper, we explore the idea of generating libraries of provably diverse low-energy solutions by extending cost function network algorithms with dedicated automaton-based diversity constraints on a large set of realistic full protein redesign problems. We observe that it is possible to generate provably diverse libraries in reasonable time and that the produced libraries do enhance the Native Sequence Recovery, a traditional measure of the reliability of design methods.
topic computational protein design
graphical models
automata
cost function networks
structural biology
diversity
url https://www.mdpi.com/1999-4893/14/6/168