Selecting Near-Native Protein Structures from Predicted Decoy Sets Using Ordered Graphlet Degree Similarity
Effective prediction of protein tertiary structure from sequence is an important and challenging problem in computational structural biology. Ab initio protein structure prediction is based on amino acid sequence alone and therefore has a wide application area. With the ab initio method, a large number...
Main Authors: | Xu Han, Li Li, Yonggang Lu |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2019-02-01 |
Series: | Genes |
Subjects: | GR_score, dynamic programming, gap penalty, near-native protein, protein structure prediction |
Online Access: | https://www.mdpi.com/2073-4425/10/2/132 |
id |
doaj-730d748de4554db2bc7bb1597402a761 |
---|---|
record_format |
Article |
spelling |
doaj-730d748de4554db2bc7bb1597402a761; 2020-11-24T23:30:54Z; eng; MDPI AG; Genes; 2073-4425; 2019-02-01; volume 10, issue 2, article 132; 10.3390/genes10020132; genes10020132; Selecting Near-Native Protein Structures from Predicted Decoy Sets Using Ordered Graphlet Degree Similarity; Xu Han, Li Li, Yonggang Lu (all at School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China); Effective prediction of protein tertiary structure from sequence is an important and challenging problem in computational structural biology. Ab initio protein structure prediction is based on amino acid sequence alone and therefore has a wide application area. With the ab initio method, a large number of candidate protein structures, called a decoy set, can be predicted; however, it is difficult to select a good near-native structure from the predicted decoy set. In this work we propose a new method for selecting the near-native structure from the decoy set based on both contact map overlap (CMO) and graphlets. By generalizing graphlets to ordered graphs and using dynamic programming to select the optimal alignment with an introduced gap penalty, a GR_score is defined for calculating the similarity between three-dimensional (3D) decoy structures. The proposed method was applied to all 54 single-domain targets in CASP11 and all 43 targets in CASP10, and ensemble clustering was used to cluster the protein decoy structures based on the computed GR_scores. The most popular centroid structure was selected as the near-native structure. The experiments showed that, compared to the SPICKER method, which is used in I-TASSER, the proposed method can usually select better near-native structures in terms of the similarity between the selected structure and the true native structure.; https://www.mdpi.com/2073-4425/10/2/132; GR_score, dynamic programming, gap penalty, near-native protein, protein structure prediction |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Xu Han, Li Li, Yonggang Lu |
spellingShingle |
Xu Han; Li Li; Yonggang Lu; Selecting Near-Native Protein Structures from Predicted Decoy Sets Using Ordered Graphlet Degree Similarity; Genes; GR_score, dynamic programming, gap penalty, near-native protein, protein structure prediction |
author_facet |
Xu Han, Li Li, Yonggang Lu |
author_sort |
Xu Han |
title |
Selecting Near-Native Protein Structures from Predicted Decoy Sets Using Ordered Graphlet Degree Similarity |
title_short |
Selecting Near-Native Protein Structures from Predicted Decoy Sets Using Ordered Graphlet Degree Similarity |
title_full |
Selecting Near-Native Protein Structures from Predicted Decoy Sets Using Ordered Graphlet Degree Similarity |
title_fullStr |
Selecting Near-Native Protein Structures from Predicted Decoy Sets Using Ordered Graphlet Degree Similarity |
title_full_unstemmed |
Selecting Near-Native Protein Structures from Predicted Decoy Sets Using Ordered Graphlet Degree Similarity |
title_sort |
selecting near-native protein structures from predicted decoy sets using ordered graphlet degree similarity |
publisher |
MDPI AG |
series |
Genes |
issn |
2073-4425 |
publishDate |
2019-02-01 |
description |
Effective prediction of protein tertiary structure from sequence is an important and challenging problem in computational structural biology. Ab initio protein structure prediction is based on amino acid sequence alone and therefore has a wide application area. With the ab initio method, a large number of candidate protein structures, called a decoy set, can be predicted; however, it is difficult to select a good near-native structure from the predicted decoy set. In this work we propose a new method for selecting the near-native structure from the decoy set based on both contact map overlap (CMO) and graphlets. By generalizing graphlets to ordered graphs and using dynamic programming to select the optimal alignment with an introduced gap penalty, a GR_score is defined for calculating the similarity between three-dimensional (3D) decoy structures. The proposed method was applied to all 54 single-domain targets in CASP11 and all 43 targets in CASP10, and ensemble clustering was used to cluster the protein decoy structures based on the computed GR_scores. The most popular centroid structure was selected as the near-native structure. The experiments showed that, compared to the SPICKER method, which is used in I-TASSER, the proposed method can usually select better near-native structures in terms of the similarity between the selected structure and the true native structure. |
topic |
GR_score, dynamic programming, gap penalty, near-native protein, protein structure prediction |
url |
https://www.mdpi.com/2073-4425/10/2/132 |
work_keys_str_mv |
AT xuhan selectingnearnativeproteinstructuresfrompredicteddecoysetsusingorderedgraphletdegreesimilarity AT lili selectingnearnativeproteinstructuresfrompredicteddecoysetsusingorderedgraphletdegreesimilarity AT yongganglu selectingnearnativeproteinstructuresfrompredicteddecoysetsusingorderedgraphletdegreesimilarity |
_version_ |
1725539763677560832 |