CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems.

The CRISPR/Cas9-sgRNA system has recently become a popular tool for genome editing and a very hot topic in the field of medical research. In this system, Cas9 protein is directed to a desired location for gene engineering and cleaves target DNA sequence which is complementary to a 20-nucleotide guid...

Full description

Bibliographic Details
Main Authors: Md Khaledur Rahman, M Sohel Rahman
Format: Article
Language:English
Published: Public Library of Science (PLoS) 2017-01-01
Series:PLoS ONE
Online Access:http://europepmc.org/articles/PMC5540555?pdf=render
id doaj-199763b23e2744e1b330022eb4fe376e
record_format Article
spelling doaj-199763b23e2744e1b330022eb4fe376e2020-11-25T01:24:05ZengPublic Library of Science (PLoS)PLoS ONE1932-62032017-01-01128e018194310.1371/journal.pone.0181943CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems.Md Khaledur RahmanM Sohel RahmanThe CRISPR/Cas9-sgRNA system has recently become a popular tool for genome editing and a very hot topic in the field of medical research. In this system, Cas9 protein is directed to a desired location for gene engineering and cleaves target DNA sequence which is complementary to a 20-nucleotide guide sequence found within the sgRNA. A lot of experimental efforts, ranging from in vivo selection to in silico modeling, have been made for efficient designing of sgRNAs in CRISPR/Cas9 system. In this article, we present a novel tool, called CRISPRpred, for efficient in silico prediction of sgRNAs on-target activity which is based on the applications of Support Vector Machine (SVM) model. To conduct experiments, we have used a benchmark dataset of 17 genes and 5310 guide sequences where there are only 20% true values. CRISPRpred achieves Area Under Receiver Operating Characteristics Curve (AUROC-Curve), Area Under Precision Recall Curve (AUPR-Curve) and maximum Matthews Correlation Coefficient (MCC) as 0.85, 0.56 and 0.48, respectively. Our tool shows approximately 5% improvement in AUPR-Curve and after analyzing all evaluation metrics, we find that CRISPRpred is better than the current state-of-the-art. CRISPRpred is enough flexible to extract relevant features and use them in a learning algorithm. The source code of our entire software with relevant dataset can be found in the following link: https://github.com/khaled-buet/CRISPRpred.http://europepmc.org/articles/PMC5540555?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Md Khaledur Rahman
M Sohel Rahman
spellingShingle Md Khaledur Rahman
M Sohel Rahman
CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems.
PLoS ONE
author_facet Md Khaledur Rahman
M Sohel Rahman
author_sort Md Khaledur Rahman
title CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems.
title_short CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems.
title_full CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems.
title_fullStr CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems.
title_full_unstemmed CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems.
title_sort crisprpred: a flexible and efficient tool for sgrnas on-target activity prediction in crispr/cas9 systems.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2017-01-01
description The CRISPR/Cas9-sgRNA system has recently become a popular tool for genome editing and a very hot topic in the field of medical research. In this system, Cas9 protein is directed to a desired location for gene engineering and cleaves target DNA sequence which is complementary to a 20-nucleotide guide sequence found within the sgRNA. A lot of experimental efforts, ranging from in vivo selection to in silico modeling, have been made for efficient designing of sgRNAs in CRISPR/Cas9 system. In this article, we present a novel tool, called CRISPRpred, for efficient in silico prediction of sgRNAs on-target activity which is based on the applications of Support Vector Machine (SVM) model. To conduct experiments, we have used a benchmark dataset of 17 genes and 5310 guide sequences where there are only 20% true values. CRISPRpred achieves Area Under Receiver Operating Characteristics Curve (AUROC-Curve), Area Under Precision Recall Curve (AUPR-Curve) and maximum Matthews Correlation Coefficient (MCC) as 0.85, 0.56 and 0.48, respectively. Our tool shows approximately 5% improvement in AUPR-Curve and after analyzing all evaluation metrics, we find that CRISPRpred is better than the current state-of-the-art. CRISPRpred is enough flexible to extract relevant features and use them in a learning algorithm. The source code of our entire software with relevant dataset can be found in the following link: https://github.com/khaled-buet/CRISPRpred.
url http://europepmc.org/articles/PMC5540555?pdf=render
work_keys_str_mv AT mdkhaledurrahman crisprpredaflexibleandefficienttoolforsgrnasontargetactivitypredictionincrisprcas9systems
AT msohelrahman crisprpredaflexibleandefficienttoolforsgrnasontargetactivitypredictionincrisprcas9systems
_version_ 1725119006842552320