Guaranteed Diversity and Optimality in Cost Function Network Based Computational Protein Design Methods

Bibliographic Details
Main Authors: Manon Ruffini, Jelena Vucinic, Simon de Givry, George Katsirelos, Sophie Barbe, Thomas Schiex
Format: Article
Language: English
Published: MDPI AG 2021-05-01
Series: Algorithms
Online Access: https://www.mdpi.com/1999-4893/14/6/168
Description
Summary: Proteins are the main active molecules of life. Although natural proteins play many roles, for example as enzymes or antibodies, there is a need to go beyond the repertoire of natural proteins and produce engineered proteins that precisely meet application requirements in terms of function, stability, activity, or other protein capacities. Computational Protein Design aims at designing new proteins from first principles, using full-atom molecular models. However, the size and complexity of proteins require approximations to make them amenable to energetic optimization queries. These approximations make the design process less reliable, and even a provably optimal solution may fail. In practice, expensive libraries of solutions are therefore generated and tested. In this paper, we explore the idea of generating libraries of provably diverse low-energy solutions by extending cost function network algorithms with dedicated automaton-based diversity constraints, and we evaluate the approach on a large set of realistic full protein redesign problems. We observe that it is possible to generate provably diverse libraries in reasonable time and that the produced libraries do enhance the Native Sequence Recovery, a traditional measure of the reliability of design methods.
ISSN: 1999-4893
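
The idea summarized above, a library of low-energy sequences that are guaranteed to differ from one another, can be illustrated on a toy problem. The sketch below is not the paper's method: it replaces the automaton-based diversity constraints and the cost function network solver with brute-force enumeration, and it assumes a Hamming-distance notion of diversity. All names (`energy`, `diverse_library`, `native_sequence_recovery`) and the two-letter toy energy tables are hypothetical, chosen only for illustration.

```python
from itertools import product


def energy(seq, unary, pairwise):
    """Cost of a sequence in a toy cost function network:
    one unary term per position plus pairwise terms on selected position pairs."""
    total = sum(unary[i][aa] for i, aa in enumerate(seq))
    for (i, j), table in pairwise.items():
        total += table[(seq[i], seq[j])]
    return total


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def diverse_library(domains, unary, pairwise, size, min_dist):
    """Greedy library construction by exhaustive enumeration (toy sizes only).

    Each selected sequence has the lowest energy among all sequences that lie
    at Hamming distance >= min_dist from every previously selected one.
    """
    candidates = sorted(product(*domains), key=lambda s: energy(s, unary, pairwise))
    library = []
    for seq in candidates:
        if all(hamming(seq, kept) >= min_dist for kept in library):
            library.append(seq)
            if len(library) == size:
                break
    return library


def native_sequence_recovery(design, native):
    """Fraction of positions where the designed residue matches the native one."""
    return sum(d == n for d, n in zip(design, native)) / len(native)


# Hypothetical toy instance: 3 positions, two residue types (A, L) per position.
domains = ["AL", "AL", "AL"]
unary = [{"A": 0, "L": 1}] * 3
# Positions 0 and 1 are penalized when they carry different residues.
pairwise = {(0, 1): {("A", "A"): 0, ("A", "L"): 1, ("L", "A"): 1, ("L", "L"): 0}}

library = diverse_library(domains, unary, pairwise, size=3, min_dist=2)
for seq in library:
    print("".join(seq), energy(seq, unary, pairwise),
          native_sequence_recovery(seq, "AAA"))
```

Exhaustive enumeration is only viable for tiny instances like this one; the point of the sketch is the selection rule, where every library member is the best sequence that still satisfies the distance requirement with respect to the members chosen before it.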