Riboflow: Using Deep Learning to Classify Riboswitches With ∼99% Accuracy
Riboswitches are cis-regulatory genetic elements that use an aptamer to control gene expression. Specificity to cognate ligand and diversity of such ligands have expanded the functional repertoire of riboswitches to mediate mounting apt responses to sudden metabolic demands and signal changes in env...
Main Authors: | Keshav Aditya R. Premkumar, Ramit Bharanikumar, Ashok Palaniappan |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2020-07-01 |
Series: | Frontiers in Bioengineering and Biotechnology |
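Subjects: | riboswitch family; synthetic biology; machine learning; convolutional neural network; recurrent neural network; hyperparameter optimization |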
Online Access: | https://www.frontiersin.org/article/10.3389/fbioe.2020.00808/full |
id |
doaj-eca3db7e403e413a98839a8d7043f698 |
---|---|
record_format |
Article |
spelling |
doaj-eca3db7e403e413a98839a8d7043f698; 2020-11-25T02:59:53Z; eng; Frontiers Media S.A.; Frontiers in Bioengineering and Biotechnology; 2296-4185; 2020-07-01; 8; 10.3389/fbioe.2020.00808; 520990; Riboflow: Using Deep Learning to Classify Riboswitches With ∼99% Accuracy; Keshav Aditya R. Premkumar (0); Ramit Bharanikumar (1); Ashok Palaniappan (2); (0) MS Program in Computer Science, Department of Computer Science, College of Engineering and Applied Sciences, Stony Brook University, Stony Brook, NY, United States; (1) MS in Bioinformatics, Georgia Institute of Technology, Atlanta, GA, United States; (2) Department of Bioinformatics, School of Chemical and Biotechnology, SASTRA Deemed University, Thanjavur, India; Riboswitches are cis-regulatory genetic elements that use an aptamer to control gene expression. Specificity to cognate ligand and diversity of such ligands have expanded the functional repertoire of riboswitches to mediate mounting apt responses to sudden metabolic demands and signal changes in environmental conditions. Given their critical role in microbial life, riboswitch characterisation remains a challenging computational problem. Here we have addressed the issue with advanced deep learning frameworks, namely convolutional neural networks (CNN), and bidirectional recurrent neural networks (RNN) with Long Short-Term Memory (LSTM). Using a comprehensive dataset of 32 ligand classes and a stratified train-validate-test approach, we demonstrated the accurate performance of both the deep learning models (CNN and RNN) relative to conventional hyperparameter-optimized machine learning classifiers on all key performance metrics, including the ROC curve analysis. In particular, the bidirectional LSTM RNN emerged as the best-performing learning method for identifying the ligand-specificity of riboswitches with an accuracy >0.99 and macro-averaged F-score of 0.96. An additional attraction is that the deep learning models do not require prior feature engineering. A dynamic update functionality is built into the models to factor for the constant discovery of new riboswitches, and extend the predictive modeling to new classes. Our work would enable the design of genetic circuits with custom-tuned riboswitch aptamers that would effect precise translational control in synthetic biology. The associated software is available as an open-source Python package and standalone resource for use in genome annotation, synthetic biology, and biotechnology workflows.; https://www.frontiersin.org/article/10.3389/fbioe.2020.00808/full; riboswitch family; synthetic biology; machine learning; convolutional neural network; recurrent neural network; hyperparameter optimization |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Keshav Aditya R. Premkumar; Ramit Bharanikumar; Ashok Palaniappan |
author_sort |
Keshav Aditya R. Premkumar |
title |
Riboflow: Using Deep Learning to Classify Riboswitches With ∼99% Accuracy |
publisher |
Frontiers Media S.A. |
series |
Frontiers in Bioengineering and Biotechnology |
issn |
2296-4185 |
publishDate |
2020-07-01 |
description |
Riboswitches are cis-regulatory genetic elements that use an aptamer to control gene expression. Specificity to cognate ligand and diversity of such ligands have expanded the functional repertoire of riboswitches to mediate mounting apt responses to sudden metabolic demands and signal changes in environmental conditions. Given their critical role in microbial life, riboswitch characterisation remains a challenging computational problem. Here we have addressed the issue with advanced deep learning frameworks, namely convolutional neural networks (CNN), and bidirectional recurrent neural networks (RNN) with Long Short-Term Memory (LSTM). Using a comprehensive dataset of 32 ligand classes and a stratified train-validate-test approach, we demonstrated the accurate performance of both the deep learning models (CNN and RNN) relative to conventional hyperparameter-optimized machine learning classifiers on all key performance metrics, including the ROC curve analysis. In particular, the bidirectional LSTM RNN emerged as the best-performing learning method for identifying the ligand-specificity of riboswitches with an accuracy >0.99 and macro-averaged F-score of 0.96. An additional attraction is that the deep learning models do not require prior feature engineering. A dynamic update functionality is built into the models to factor for the constant discovery of new riboswitches, and extend the predictive modeling to new classes. Our work would enable the design of genetic circuits with custom-tuned riboswitch aptamers that would effect precise translational control in synthetic biology. The associated software is available as an open-source Python package and standalone resource for use in genome annotation, synthetic biology, and biotechnology workflows. |
topic |
riboswitch family; synthetic biology; machine learning; convolutional neural network; recurrent neural network; hyperparameter optimization |
url |
https://www.frontiersin.org/article/10.3389/fbioe.2020.00808/full |
work_keys_str_mv |
AT keshavadityarpremkumar riboflowusingdeeplearningtoclassifyriboswitcheswith99accuracy AT ramitbharanikumar riboflowusingdeeplearningtoclassifyriboswitcheswith99accuracy AT ashokpalaniappan riboflowusingdeeplearningtoclassifyriboswitcheswith99accuracy |
_version_ |
1724700491285266432 |
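
The description above mentions a stratified train-validate-test approach and reports both an accuracy >0.99 and a macro-averaged F-score of 0.96. The following is a minimal sketch of those two ideas in plain Python, included only to illustrate why the two metrics can differ when ligand classes are unevenly sized. The function names, the 80/10/10 split fractions, and the toy labels are illustrative assumptions and are not taken from the Riboflow package.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split sample indices into train/validation/test sets class by class,
    so every class keeps roughly the same share in each set.
    The default fractions are an assumption for illustration."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n_train = int(round(fractions[0] * len(idxs)))
        n_val = int(round(fractions[1] * len(idxs)))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores: each class counts equally,
    regardless of how many samples it has."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (hypothetical): 98 samples of a common class, 2 of a rare one,
# with a single rare sample misclassified as common.
y_true = ["common"] * 98 + ["rare"] * 2
y_pred = ["common"] * 99 + ["rare"]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

In this toy case one missed sample from the rare class leaves accuracy at 0.99 while the macro-averaged F1 drops to about 0.83, because the rare class contributes half of the macro average. This is one reason a macro-averaged score is a useful companion to accuracy on a multi-class dataset.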