Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms [v3; ref status: indexed, http://f1000r.es/2kg]
Predicting the three-dimensional native state structure of a protein from its primary sequence is an unsolved grand challenge in molecular biology. Two main computational approaches have evolved to obtain the structure from the protein sequence - ab initio/de novo methods and template-based modeling...
Main Authors: | Sandeep Chakraborty, Ravindra Venkatramani, Basuthkar J. Rao, Bjarni Asgeirsson, Abhaya M. Dandekar |
---|---|
Format: | Article |
Language: | English |
Published: | F1000 Research Ltd, 2013-12-01 |
Series: | F1000Research |
Subjects: | Experimental Biophysical Methods; Protein Folding; Structural Genomics; Theory & Simulation |
Online Access: | http://f1000research.com/articles/2-211/v3 |
id |
doaj-43868bf7768c4b4fb06efd42891ac2d1 |
---|---|
record_format |
Article |
spelling |
doaj-43868bf7768c4b4fb06efd42891ac2d1; 2020-11-25T02:53:51Z; eng; F1000 Research Ltd; F1000Research; 2046-1402; 2013-12-01; 2; 10.12688/f1000research.2-211.v3; 3328
Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms [v3; ref status: indexed, http://f1000r.es/2kg]
Sandeep Chakraborty (0), Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
Ravindra Venkatramani (1), Department of Chemical Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
Basuthkar J. Rao (2), Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
Bjarni Asgeirsson (3), Science Institute, Department of Biochemistry, University of Iceland, Reykjavik, IS-107, Iceland
Abhaya M. Dandekar (4), Plant Sciences Department, University of California, Davis, CA 95616, USA |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Sandeep Chakraborty; Ravindra Venkatramani; Basuthkar J. Rao; Bjarni Asgeirsson; Abhaya M. Dandekar |
author_sort |
Sandeep Chakraborty |
title |
Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms [v3; ref status: indexed, http://f1000r.es/2kg] |
publisher |
F1000 Research Ltd |
series |
F1000Research |
issn |
2046-1402 |
publishDate |
2013-12-01 |
description |
Predicting the three-dimensional native state structure of a protein from its primary sequence is an unsolved grand challenge in molecular biology. Two main computational approaches have evolved to obtain the structure from the protein sequence - ab initio/de novo methods and template-based modeling - both of which typically generate multiple possible native state structures. Model quality assessment programs (MQAP) validate these predicted structures in order to identify the correct native state structure. Here, we propose an MQAP for assessing the quality of protein structures based on the distances between consecutive Cα atoms. We hypothesize that the root-mean-square deviation of the distances between consecutive Cα atoms (RDCC) from the ideal value of 3.8 Å, derived from a statistical analysis of high quality protein structures (top100H database), is minimized in native structures. Based on tests with the top100H set, we propose an RDCC cutoff value of 0.012 Å, above which a structure can be filtered out as non-native. We applied the RDCC discriminator to decoy sets from the Decoys 'R' Us database to show that the native structures in all decoy sets tested have an RDCC below the 0.012 Å cutoff. While most decoy sets were either indistinguishable using this discriminator or had very few violations, all the decoy structures in the fisa decoy set were discriminated by applying the RDCC criterion. This highlights the physical non-viability of the fisa decoy set, and possible issues in benchmarking other methods using this set. The source code and manual are available at https://github.com/sanchak/mqap and permanently archived under DOI 10.5281/zenodo.7134. |
topic |
Experimental Biophysical Methods; Protein Folding; Structural Genomics; Theory & Simulation |
url |
http://f1000research.com/articles/2-211/v3 |
_version_ |
1724724088754143232 |
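
The RDCC criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (that is in the repository linked above). It assumes that RDCC is the root-mean-square of (d_i - 3.8 Å) over all consecutive Cα pairs in a single unbroken chain, and that Cα coordinates have already been extracted as (x, y, z) tuples in Å. The function names `consecutive_ca_distances`, `rdcc`, and `passes_rdcc_filter` are hypothetical, as is the decision to treat a value exactly equal to the cutoff as passing.

```python
import math

IDEAL_CA_DISTANCE = 3.8  # Å, ideal consecutive Cα distance quoted in the abstract
RDCC_CUTOFF = 0.012      # Å, cutoff quoted in the abstract


def consecutive_ca_distances(ca_coords):
    """Return the distances (Å) between each pair of consecutive Cα positions.

    Assumption: ca_coords is an ordered list of (x, y, z) tuples for one
    unbroken chain; chain breaks and missing residues are not handled.
    """
    return [math.dist(a, b) for a, b in zip(ca_coords, ca_coords[1:])]


def rdcc(ca_coords, ideal=IDEAL_CA_DISTANCE):
    """Root-mean-square deviation of consecutive Cα distances from `ideal`.

    Assumption: deviations are measured from the fixed ideal value,
    not from the mean distance of the structure itself.
    """
    distances = consecutive_ca_distances(ca_coords)
    if not distances:
        raise ValueError("need at least two Cα atoms")
    return math.sqrt(sum((d - ideal) ** 2 for d in distances) / len(distances))


def passes_rdcc_filter(ca_coords, cutoff=RDCC_CUTOFF):
    """True if the structure is not filtered out (RDCC at or below the cutoff)."""
    return rdcc(ca_coords) <= cutoff


# Toy coordinates (not real protein data): an evenly spaced chain and a
# chain whose second Cα-Cα distance is stretched to 4.0 Å.
ideal_chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
stretched_chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.8, 0.0, 0.0)]

print(passes_rdcc_filter(ideal_chain))      # evenly spaced chain passes
print(passes_rdcc_filter(stretched_chain))  # stretched chain is filtered out
```

In the toy example the stretched chain has deviations of 0 Å and 0.2 Å, giving an RDCC of about 0.14 Å, well above the 0.012 Å cutoff.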