Representations of protein structure for exploring the conformational space: A speed–accuracy trade-off

The recent breakthrough in the field of protein structure prediction shows the relevance of using knowledge-based scoring functions in combination with a low-resolution 3D representation of protein macromolecules. The choice of not using all atoms is barely supported by any data in the literat...


Bibliographic Details
Main Authors: Guillaume Postic, Nathalie Janel, Gautier Moroy
Format: Article
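Language: English
Published: Elsevier 2021-01-01
Series: Computational and Structural Biotechnology Journal
Subjects: Protein structure prediction; Statistical potentials; Coarse-grained models; Protein folding; Low-resolution representation
Online Access: http://www.sciencedirect.com/science/article/pii/S2001037021001616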
id doaj-5601594ca1714103b2f5612d020162d1
record_format Article
spelling doaj-5601594ca1714103b2f5612d020162d1; 2021-05-08T04:22:23Z; eng; Elsevier; Computational and Structural Biotechnology Journal; ISSN 2001-0370; 2021-01-01; vol. 19, pp. 2618-2625
Representations of protein structure for exploring the conformational space: A speed–accuracy trade-off
Guillaume Postic (Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, F-75013 Paris, France; corresponding author)
Nathalie Janel (Université de Paris, BFA, UMR 8251, CNRS, F-75013 Paris, France)
Gautier Moroy (Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, F-75013 Paris, France)
collection DOAJ
language English
format Article
sources DOAJ
author Guillaume Postic
Nathalie Janel
Gautier Moroy
author_sort Guillaume Postic
title Representations of protein structure for exploring the conformational space: A speed–accuracy trade-off
publisher Elsevier
series Computational and Structural Biotechnology Journal
issn 2001-0370
publishDate 2021-01-01
description The recent breakthrough in the field of protein structure prediction shows the relevance of using knowledge-based scoring functions in combination with a low-resolution 3D representation of protein macromolecules. The choice of not using all atoms is barely supported by any data in the literature, and is mostly motivated by empirical and practical reasons, such as the computational cost of assessing the numerous folds of the protein conformational space. Here, we present a comprehensive study, carried out on a large and balanced benchmark of predicted protein structures, to see how different types of structural representations rank in either accuracy or calculation speed, and which ones offer the best compromise between these two criteria. We tested ten representations, including low-resolution, high-resolution, and coarse-grained approaches. We also investigated whether the findings generalize to formalisms other than the widely used “potential of mean force” (PMF) method. We observed that representing protein structures by their β carbons (combined or not with Cα) provides the best speed–accuracy trade-off when using a “total information gain” scoring function. For statistical PMFs, using MARTINI backbone and side-chain beads is the best option. Finally, we also demonstrated the necessity of training the reference state on all atom types, and of including the Cα atoms of glycine residues in a Cβ-based representation.
topic Protein structure prediction
Statistical potentials
Coarse-grained models
Protein folding
Low-resolution representation
url http://www.sciencedirect.com/science/article/pii/S2001037021001616
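The description mentions a Cβ-based representation that includes the Cα atoms of glycine residues (glycine has no Cβ). The following is a minimal illustrative sketch of that idea, not the authors' implementation: it reads fixed-column PDB `ATOM` records and keeps one point per residue. The names `cb_representation`, `_atom_line` and `DEMO_PDB`, as well as the demo coordinates, are hypothetical and introduced only for this example.

```python
from typing import List, Tuple

Coord = Tuple[float, float, float]
Point = Tuple[str, int, str, Coord]  # (residue name, residue number, atom name, xyz)


def cb_representation(pdb_text: str) -> List[Point]:
    """Keep one atom per residue: CB, or CA when the residue is glycine.

    Assumes standard fixed-column PDB ATOM records (hypothetical helper).
    """
    points: List[Point] = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        atom_name = line[12:16].strip()   # columns 13-16
        alt_loc = line[16]                # column 17
        res_name = line[17:20].strip()    # columns 18-20
        res_seq = int(line[22:26])        # columns 23-26
        if alt_loc not in (" ", "A"):
            continue                      # keep only the first alternate location
        wanted = "CA" if res_name == "GLY" else "CB"
        if atom_name != wanted:
            continue
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        points.append((res_name, res_seq, atom_name, xyz))
    return points


def _atom_line(serial: int, name: str, res: str, seq: int,
               x: float, y: float, z: float) -> str:
    """Format a demo ATOM record with the fixed PDB column layout (chain A)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} A{seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Made-up three-residue fragment used only to exercise the function.
DEMO_PDB = "\n".join([
    _atom_line(1, "N", "ALA", 1, -1.0, 0.0, 0.0),
    _atom_line(2, "CA", "ALA", 1, 0.0, 0.0, 0.0),
    _atom_line(3, "CB", "ALA", 1, 0.5, 1.4, 0.0),
    _atom_line(4, "N", "GLY", 2, 2.5, 0.0, 0.0),
    _atom_line(5, "CA", "GLY", 2, 3.8, 0.0, 0.0),
    _atom_line(6, "CA", "SER", 3, 7.6, 0.0, 0.0),
    _atom_line(7, "CB", "SER", 3, 8.1, 1.4, 0.0),
])

points = cb_representation(DEMO_PDB)
for res_name, res_seq, atom_name, xyz in points:
    print(res_name, res_seq, atom_name, xyz)
```

Reducing each residue to a single point cuts the number of pairwise distances a scoring function must evaluate, which is the speed side of the trade-off the description refers to. The glycine fallback keeps those residues from being silently dropped.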