Riboflow: Using Deep Learning to Classify Riboswitches With ∼99% Accuracy

Bibliographic Details
Main Authors: Keshav Aditya R. Premkumar, Ramit Bharanikumar, Ashok Palaniappan
Format: Article
Language: English
Published: Frontiers Media S.A. 2020-07-01
Series: Frontiers in Bioengineering and Biotechnology
Online Access: https://www.frontiersin.org/article/10.3389/fbioe.2020.00808/full
id doaj-eca3db7e403e413a98839a8d7043f698
record_format Article
spelling doaj-eca3db7e403e413a98839a8d7043f698 2020-11-25T02:59:53Z
volume 8
doi 10.3389/fbioe.2020.00808
520990
Keshav Aditya R. Premkumar: MS Program in Computer Science, Department of Computer Science, College of Engineering and Applied Sciences, Stony Brook University, Stony Brook, NY, United States
Ramit Bharanikumar: MS in Bioinformatics, Georgia Institute of Technology, Atlanta, GA, United States
Ashok Palaniappan: Department of Bioinformatics, School of Chemical and Biotechnology, SASTRA Deemed University, Thanjavur, India
collection DOAJ
sources DOAJ
issn 2296-4185
description Riboswitches are cis-regulatory genetic elements that use an aptamer to control gene expression. Specificity for the cognate ligand, together with the diversity of such ligands, has expanded the functional repertoire of riboswitches, which mediate apt responses to sudden metabolic demands and signal changes in environmental conditions. Given their critical role in microbial life, riboswitch characterisation remains a challenging computational problem. Here we have addressed the issue with advanced deep learning frameworks, namely convolutional neural networks (CNN) and bidirectional recurrent neural networks (RNN) with Long Short-Term Memory (LSTM). Using a comprehensive dataset of 32 ligand classes and a stratified train-validate-test approach, we demonstrated the accurate performance of both deep learning models (CNN and RNN) relative to conventional hyperparameter-optimized machine learning classifiers on all key performance metrics, including ROC curve analysis. In particular, the bidirectional LSTM RNN emerged as the best-performing learning method for identifying the ligand specificity of riboswitches, with an accuracy >0.99 and a macro-averaged F-score of 0.96. An additional attraction is that the deep learning models do not require prior feature engineering. A dynamic update functionality is built into the models to account for the constant discovery of new riboswitches and to extend the predictive modeling to new classes. Our work would enable the design of genetic circuits with custom-tuned riboswitch aptamers that would effect precise translational control in synthetic biology. The associated software is available as an open-source Python package and standalone resource for use in genome annotation, synthetic biology, and biotechnology workflows.
topic riboswitch family
synthetic biology
machine learning
convolutional neural network
recurrent neural network
hyperparameter optimization
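
The description reports an accuracy above 0.99 alongside a macro-averaged F-score of 0.96. The gap between the two is expected whenever classes are imbalanced, because accuracy is dominated by the large classes while the macro average weights every class equally. The sketch below illustrates this with plain Python. The labels and counts are hypothetical toy data, not taken from the article, and the helper names `accuracy` and `macro_f1` are introduced here for illustration only; they are not part of the Riboflow package.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: one large class and one small class.
y_true = ["TPP"] * 96 + ["SAM"] * 4
# The classifier gets every TPP right but misses half of the SAM examples.
y_pred = ["TPP"] * 96 + ["TPP"] * 2 + ["SAM"] * 2

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this toy case only two of a hundred predictions are wrong, so accuracy stays high, but the small class has a poor F1 and pulls the macro average down. Reporting both metrics, as the description does, gives a fuller picture for a dataset with many ligand classes of unequal size.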