Improved sequence-based prediction of interaction sites in α-helical transmembrane proteins by deep learning

Interactions between transmembrane (TM) proteins are fundamental for a wide spectrum of cellular functions, but precise molecular details of these interactions remain largely unknown due to the scarcity of experimentally determined three-dimensional complex structures. Computational techniques are t...


Bibliographic Details
Main Authors: Jianfeng Sun, Dmitrij Frishman
Format: Article
Language: English
Published: Elsevier 2021-01-01
Series: Computational and Structural Biotechnology Journal
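Subjects: Protein-protein interactions; Protein structure; Protein function; Molecular evolution; Sequence annotation; Deep learning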
Online Access: http://www.sciencedirect.com/science/article/pii/S2001037021000775
id doaj-4e0b6eb9abbc412c91ed739621cc8b80
record_format Article
spelling doaj-4e0b6eb9abbc412c91ed739621cc8b80
2021-03-22T12:49:20Z
eng
Elsevier
Computational and Structural Biotechnology Journal
2001-0370
2021-01-01
19
1512
1530
Improved sequence-based prediction of interaction sites in α-helical transmembrane proteins by deep learning
Jianfeng Sun [0]
Dmitrij Frishman [1]
Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
Corresponding author.; Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
Interactions between transmembrane (TM) proteins are fundamental for a wide spectrum of cellular functions, but precise molecular details of these interactions remain largely unknown due to the scarcity of experimentally determined three-dimensional complex structures. Computational techniques are therefore required for a large-scale annotation of interaction sites in TM proteins. Here, we present a novel deep-learning approach, DeepTMInter, for sequence-based prediction of interaction sites in α-helical TM proteins based on their topological, physicochemical, and evolutionary properties. Using a combination of ultra-deep residual neural networks with a stacked generalization ensemble technique, DeepTMInter significantly outperforms existing methods, achieving AUC/AUCPR values of 0.689/0.598. Across the main functional families of human transmembrane proteins, the percentage of amino acid sites predicted to be involved in interactions typically ranges between 10% and 25%, and up to 30% in ion channels. DeepTMInter is available as a standalone package at https://github.com/2003100127/deeptminter. The training and benchmarking datasets are available at https://data.mendeley.com/datasets/2t8kgwzp35.
http://www.sciencedirect.com/science/article/pii/S2001037021000775
Protein-protein interactions
Protein structure
Protein function
Molecular evolution
Sequence annotation
Deep learning
collection DOAJ
language English
format Article
sources DOAJ
author Jianfeng Sun
Dmitrij Frishman
title Improved sequence-based prediction of interaction sites in α-helical transmembrane proteins by deep learning
publisher Elsevier
series Computational and Structural Biotechnology Journal
issn 2001-0370
publishDate 2021-01-01
description Interactions between transmembrane (TM) proteins are fundamental for a wide spectrum of cellular functions, but precise molecular details of these interactions remain largely unknown due to the scarcity of experimentally determined three-dimensional complex structures. Computational techniques are therefore required for a large-scale annotation of interaction sites in TM proteins. Here, we present a novel deep-learning approach, DeepTMInter, for sequence-based prediction of interaction sites in α-helical TM proteins based on their topological, physicochemical, and evolutionary properties. Using a combination of ultra-deep residual neural networks with a stacked generalization ensemble technique, DeepTMInter significantly outperforms existing methods, achieving AUC/AUCPR values of 0.689/0.598. Across the main functional families of human transmembrane proteins, the percentage of amino acid sites predicted to be involved in interactions typically ranges between 10% and 25%, and up to 30% in ion channels. DeepTMInter is available as a standalone package at https://github.com/2003100127/deeptminter. The training and benchmarking datasets are available at https://data.mendeley.com/datasets/2t8kgwzp35.
topic Protein-protein interactions
Protein structure
Protein function
Molecular evolution
Sequence annotation
Deep learning
url http://www.sciencedirect.com/science/article/pii/S2001037021000775