Selecting Near-Native Protein Structures from Predicted Decoy Sets Using Ordered Graphlet Degree Similarity

Effective prediction of protein tertiary structure from sequence is an important and challenging problem in computational structural biology. Ab initio protein structure prediction is based on the amino acid sequence alone and therefore has a wide application area. With the ab initio method, a large number...


Bibliographic Details
Main Authors:	Xu Han, Li Li, Yonggang Lu
Format:	Article
Language:	English
Published:	MDPI AG 2019-02-01
Series:	Genes
Subjects:	GR_score; dynamic programming; gap penalty; near-native protein; protein structure prediction
Online Access:	https://www.mdpi.com/2073-4425/10/2/132

id	doaj-730d748de4554db2bc7bb1597402a761
record_format	Article
spelling	doaj-730d748de4554db2bc7bb1597402a761; 2020-11-24T23:30:54Z; eng; MDPI AG; Genes; ISSN 2073-4425; 2019-02-01; vol. 10, no. 2, art. 132; doi:10.3390/genes10020132; genes10020132; Selecting Near-Native Protein Structures from Predicted Decoy Sets Using Ordered Graphlet Degree Similarity; Xu Han, Li Li, Yonggang Lu (all: School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China); https://www.mdpi.com/2073-4425/10/2/132; GR_score; dynamic programming; gap penalty; near-native protein; protein structure prediction
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Xu Han, Li Li, Yonggang Lu
spellingShingle	Xu Han, Li Li, Yonggang Lu; Selecting Near-Native Protein Structures from Predicted Decoy Sets Using Ordered Graphlet Degree Similarity; Genes; GR_score; dynamic programming; gap penalty; near-native protein; protein structure prediction
author_facet	Xu Han, Li Li, Yonggang Lu
author_sort	Xu Han
title	Selecting Near-Native Protein Structures from Predicted Decoy Sets Using Ordered Graphlet Degree Similarity
title_sort	selecting near-native protein structures from predicted decoy sets using ordered graphlet degree similarity
publisher	MDPI AG
series	Genes
issn	2073-4425
publishDate	2019-02-01
description	Effective prediction of protein tertiary structure from sequence is an important and challenging problem in computational structural biology. Ab initio protein structure prediction is based on the amino acid sequence alone and therefore has a wide application area. The ab initio method can predict a large number of candidate protein structures, called a decoy set; however, selecting a good near-native structure from the predicted decoy set is difficult. In this work, we propose a new method for selecting the near-native structure from the decoy set based on both contact map overlap (CMO) and graphlets. By generalizing graphlets to ordered graphs and using dynamic programming to select the optimal alignment with an introduced gap penalty, a GR_score is defined for calculating the similarity between three-dimensional (3D) decoy structures. The proposed method was applied to all 54 single-domain targets in CASP11 and all 43 targets in CASP10, and ensemble clustering was used to cluster the protein decoy structures based on the computed GR_scores. The most popular centroid structure was selected as the near-native structure. The experiments showed that, compared to the SPICKER method used in I-TASSER, the proposed method can usually select better near-native structures in terms of the similarity between the selected structure and the true native structure.
topic	GR_score; dynamic programming; gap penalty; near-native protein; protein structure prediction
url	https://www.mdpi.com/2073-4425/10/2/132
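
The abstract describes comparing decoy structures through contact maps and a dynamic-programming alignment with a gap penalty. The sketch below illustrates only that general idea and is not the authors' GR_score: the record does not give the ordered graphlet degree definition, so a plain per-residue contact degree stands in for it. The function names, the 8 Å cutoff, and the gap value of 0.5 are all assumptions made for illustration.

```python
from math import dist


def contact_map(coords, cutoff=8.0):
    """Return the set of residue pairs (i, j), i < j, within `cutoff`.

    `coords` is a list of (x, y, z) tuples, one per residue.
    The 8.0 cutoff is an assumed value, not taken from the record.
    """
    n = len(coords)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if dist(coords[i], coords[j]) <= cutoff
    }


def contact_degrees(contacts, n):
    """Count how many contacts each of the n residues takes part in."""
    deg = [0] * n
    for i, j in contacts:
        deg[i] += 1
        deg[j] += 1
    return deg


def degree_similarity(deg_a, deg_b):
    """Per-residue similarity in [0, 1]; a stand-in for a graphlet-based
    signature similarity, which the record does not specify."""
    return [[1.0 - abs(a - b) / max(a, b, 1) for b in deg_b] for a in deg_a]


def align_score(sim, gap=0.5):
    """Global (Needleman-Wunsch style) alignment score over a similarity
    matrix, with a linear gap penalty `gap` per skipped residue."""
    n = len(sim)
    m = len(sim[0]) if n else 0
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = -gap * i
    for j in range(1, m + 1):
        dp[0][j] = -gap * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + sim[i - 1][j - 1],  # align residue i with j
                dp[i - 1][j] - gap,                    # gap in second structure
                dp[i][j - 1] - gap,                    # gap in first structure
            )
    return dp[n][m]
```

A pairwise score for two decoys could then be obtained by building each contact map, deriving degrees, and passing `degree_similarity(...)` to `align_score`; a matrix of such scores would be the input to whatever clustering step follows.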


Similar Items