Transfer learning via multi-scale convolutional neural layers for human-virus protein-protein interaction prediction

Bibliographic Details
Main Authors: Lian, X. (Author), Wuchty, S. (Author), Yang, S. (Author), Yang, X. (Author), Zhang, Z. (Author)
Format: Article
Language: English
Published: Oxford University Press 2021
Description
Summary: Motivation: To complement experimental efforts, machine learning-based computational methods are playing an increasingly important role in predicting human-virus protein-protein interactions (PPIs). Furthermore, transfer learning can effectively apply prior knowledge obtained from a large source dataset/task to a small target dataset/task, improving prediction performance. Results: To predict interactions between human and viral proteins, we combine evolutionary sequence profile features with a Siamese convolutional neural network (CNN) architecture and a multi-layer perceptron. Our architecture outperforms various feature encoding-based machine learning and state-of-the-art prediction methods. As our main contribution, we introduce two transfer learning methods (i.e. 'frozen' type and 'fine-tuning' type) that reliably predict interactions in a target human-virus domain based on training in a source human-virus domain, by retraining CNN layers. Finally, we utilize the 'frozen' type transfer learning approach to predict human-SARS-CoV-2 PPIs, indicating that our predictions are topologically and functionally similar to experimentally known interactions. © The Author(s) 2021. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
ISSN: 1367-4803
DOI: 10.1093/bioinformatics/btab533
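
Illustration: The summary describes a Siamese CNN over sequence profiles feeding a multi-layer perceptron, with 'frozen' and 'fine-tuning' transfer between a source and a target human-virus domain. The following is a minimal pure-Python sketch of that idea, not the authors' implementation. It assumes the common convention in which 'frozen' keeps the source-trained CNN weights fixed and retrains only the perceptron, while 'fine-tuning' also updates the CNN weights; the record itself does not spell this out. The kernel widths, the toy profile width, the single-layer stand-in for the perceptron, the finite-difference gradients, and all names (`encode`, `predict`, `transfer`, `TRAINABLE`) are hypothetical choices made for illustration.

```python
import math
import random

WIDTHS = (2, 3)  # toy multi-scale kernel widths (assumption)
DIM = 4          # toy profile width; a real profile typically has 20 columns

def encode(profile, cnn):
    """Shared encoder: one global-max-pooled feature per kernel width."""
    feats, offset = [], 0
    for k in WIDTHS:
        kernel = cnn[offset:offset + k * DIM]
        offset += k * DIM
        best = float("-inf")
        for start in range(len(profile) - k + 1):
            window = [x for row in profile[start:start + k] for x in row]
            best = max(best, sum(w * x for w, x in zip(kernel, window)))
        feats.append(best)
    return feats

def predict(params, human, virus):
    """Siamese step: both proteins pass through the same CNN weights."""
    feats = encode(human, params["cnn"]) + encode(virus, params["cnn"])
    *w, b = params["mlp"]  # single logistic layer standing in for the MLP
    z = sum(wi * fi for wi, fi in zip(w, feats)) + b
    return 1.0 / (1.0 + math.exp(-z))

def loss(params, data):
    """Mean binary cross-entropy over (human, virus, label) triples."""
    total = 0.0
    for human, virus, y in data:
        p = predict(params, human, virus)
        total -= y * math.log(p + 1e-9) + (1 - y) * math.log(1 - p + 1e-9)
    return total / len(data)

# Parameter groups that each transfer mode is allowed to update.
TRAINABLE = {"frozen": ("mlp",), "fine-tuning": ("cnn", "mlp")}

def transfer(source_params, target_data, mode, steps=20, lr=0.1, h=1e-4):
    """Start from source weights and retrain only the groups the mode allows."""
    params = {name: list(vals) for name, vals in source_params.items()}
    for _ in range(steps):
        for group in TRAINABLE[mode]:
            grads = []
            for i, old in enumerate(params[group]):
                # Central finite difference keeps the sketch dependency-free.
                params[group][i] = old + h
                up = loss(params, target_data)
                params[group][i] = old - h
                down = loss(params, target_data)
                params[group][i] = old
                grads.append((up - down) / (2 * h))
            params[group] = [p - lr * g for p, g in zip(params[group], grads)]
    return params

def random_profile(rng, length):
    return [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(length)]

rng = random.Random(0)
# Random weights stand in for a model already trained on the source domain.
source_params = {
    "cnn": [rng.uniform(-0.5, 0.5) for _ in range(sum(WIDTHS) * DIM)],
    "mlp": [rng.uniform(-0.5, 0.5) for _ in range(2 * len(WIDTHS) + 1)],
}
# Synthetic target-domain pairs with made-up labels.
target_data = [(random_profile(rng, 6), random_profile(rng, 5), y)
               for y in (1, 0, 1, 0)]

frozen = transfer(source_params, target_data, "frozen")
tuned = transfer(source_params, target_data, "fine-tuning")
```

In this sketch the only difference between the two modes is which parameter groups receive updates, so `frozen["cnn"]` stays identical to the source weights while `tuned["cnn"]` drifts toward the target data. A real model would use a deep learning framework with automatic differentiation rather than finite differences.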