Fast protein structure comparison through effective representation learning with contrastive graph neural networks

Protein structure alignment algorithms are often time-consuming, resulting in challenges for large-scale protein structure similarity-based retrieval. There is an urgent need for more efficient structure comparison approaches as the number of protein structures increases rapidly. In this paper, we p...

Bibliographic Details
Main Authors:	Feng, S.-H. (Author), Pan, X. (Author), Shen, H.-B. (Author), Xia, C. (Author), Xia, Y. (Author)
Format:	Article
Language:	English
Published:	Public Library of Science 2022
Subjects:	algorithm; Algorithms; article; feature learning (machine learning); learning; Learning; Neural Networks, Computer; performance indicator; protein; protein structure; protein tertiary structure; Proteins; software; Software
Online Access:	View Fulltext in Publisher


LEADER	02777nam a2200349Ia 4500
001	10.1371-journal.pcbi.1009986
008	220425s2022 CNT 000 0 und d
020			\|a 1553734X (ISSN)
245	1	0	\|a Fast protein structure comparison through effective representation learning with contrastive graph neural networks
260		0	\|b Public Library of Science \|c 2022
856			\|z View Fulltext in Publisher \|u https://doi.org/10.1371/journal.pcbi.1009986
520	3		\|a Protein structure alignment algorithms are often time-consuming, which makes large-scale similarity-based retrieval of protein structures challenging. There is an urgent need for more efficient structure comparison approaches as the number of protein structures increases rapidly. In this paper, we propose an effective graph-based protein structure representation learning method, GraSR, for fast and accurate structure comparison. In GraSR, a graph is constructed based on the inter-residue distances derived from the tertiary structure. Then, deep graph neural networks (GNNs) with a short-cut connection learn graph representations of the tertiary structures under a contrastive learning framework. To further improve GraSR, a novel dynamic training data partition strategy and a length-scaling cosine distance are introduced. We objectively evaluate GraSR on SCOPe v2.07 and a newly released independent test set from the PDB database with a designed comprehensive performance metric. Compared with other state-of-the-art methods, GraSR achieves about 7%-10% improvement on the two benchmark datasets. GraSR is also much faster than alignment-based methods. We dig into the model and observe that the superiority of GraSR stems mainly from the learned discriminative residue-level and global descriptors. The web server and source code of GraSR are freely available at www.csbio.sjtu.edu.cn/bioinf/GraSR/ for academic use. Copyright: © 2022 Xia et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
650	0	4	\|a algorithm
650	0	4	\|a Algorithms
650	0	4	\|a article
650	0	4	\|a feature learning (machine learning)
650	0	4	\|a learning
650	0	4	\|a Learning
650	0	4	\|a Neural Networks, Computer
650	0	4	\|a performance indicator
650	0	4	\|a protein
650	0	4	\|a protein structure
650	0	4	\|a protein tertiary structure
650	0	4	\|a Proteins
650	0	4	\|a software
650	0	4	\|a Software
700	1		\|a Feng, S.-H. \|e author
700	1		\|a Pan, X. \|e author
700	1		\|a Shen, H.-B. \|e author
700	1		\|a Xia, C. \|e author
700	1		\|a Xia, Y. \|e author
773			\|t PLoS Computational Biology

Similar Items