CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems.
The CRISPR/Cas9-sgRNA system has recently become a popular tool for genome editing and a very hot topic in the field of medical research. In this system, Cas9 protein is directed to a desired location for gene engineering and cleaves target DNA sequence which is complementary to a 20-nucleotide guid...
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: |
Public Library of Science (PLoS)
2017-01-01
|
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC5540555?pdf=render |
id |
doaj-199763b23e2744e1b330022eb4fe376e |
---|---|
record_format |
Article |
spelling |
doaj-199763b23e2744e1b330022eb4fe376e2020-11-25T01:24:05ZengPublic Library of Science (PLoS)PLoS ONE1932-62032017-01-01128e018194310.1371/journal.pone.0181943CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems.Md Khaledur RahmanM Sohel RahmanThe CRISPR/Cas9-sgRNA system has recently become a popular tool for genome editing and a very hot topic in the field of medical research. In this system, Cas9 protein is directed to a desired location for gene engineering and cleaves target DNA sequence which is complementary to a 20-nucleotide guide sequence found within the sgRNA. A lot of experimental efforts, ranging from in vivo selection to in silico modeling, have been made for efficient designing of sgRNAs in CRISPR/Cas9 system. In this article, we present a novel tool, called CRISPRpred, for efficient in silico prediction of sgRNAs on-target activity which is based on the applications of Support Vector Machine (SVM) model. To conduct experiments, we have used a benchmark dataset of 17 genes and 5310 guide sequences where there are only 20% true values. CRISPRpred achieves Area Under Receiver Operating Characteristics Curve (AUROC-Curve), Area Under Precision Recall Curve (AUPR-Curve) and maximum Matthews Correlation Coefficient (MCC) as 0.85, 0.56 and 0.48, respectively. Our tool shows approximately 5% improvement in AUPR-Curve and after analyzing all evaluation metrics, we find that CRISPRpred is better than the current state-of-the-art. CRISPRpred is enough flexible to extract relevant features and use them in a learning algorithm. The source code of our entire software with relevant dataset can be found in the following link: https://github.com/khaled-buet/CRISPRpred.http://europepmc.org/articles/PMC5540555?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Md Khaledur Rahman M Sohel Rahman |
spellingShingle |
Md Khaledur Rahman M Sohel Rahman CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems. PLoS ONE |
author_facet |
Md Khaledur Rahman M Sohel Rahman |
author_sort |
Md Khaledur Rahman |
title |
CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems. |
title_short |
CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems. |
title_full |
CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems. |
title_fullStr |
CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems. |
title_full_unstemmed |
CRISPRpred: A flexible and efficient tool for sgRNAs on-target activity prediction in CRISPR/Cas9 systems. |
title_sort |
crisprpred: a flexible and efficient tool for sgrnas on-target activity prediction in crispr/cas9 systems. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2017-01-01 |
description |
The CRISPR/Cas9-sgRNA system has recently become a popular tool for genome editing and a very hot topic in the field of medical research. In this system, Cas9 protein is directed to a desired location for gene engineering and cleaves target DNA sequence which is complementary to a 20-nucleotide guide sequence found within the sgRNA. A lot of experimental efforts, ranging from in vivo selection to in silico modeling, have been made for efficient designing of sgRNAs in CRISPR/Cas9 system. In this article, we present a novel tool, called CRISPRpred, for efficient in silico prediction of sgRNAs on-target activity which is based on the applications of Support Vector Machine (SVM) model. To conduct experiments, we have used a benchmark dataset of 17 genes and 5310 guide sequences where there are only 20% true values. CRISPRpred achieves Area Under Receiver Operating Characteristics Curve (AUROC-Curve), Area Under Precision Recall Curve (AUPR-Curve) and maximum Matthews Correlation Coefficient (MCC) as 0.85, 0.56 and 0.48, respectively. Our tool shows approximately 5% improvement in AUPR-Curve and after analyzing all evaluation metrics, we find that CRISPRpred is better than the current state-of-the-art. CRISPRpred is enough flexible to extract relevant features and use them in a learning algorithm. The source code of our entire software with relevant dataset can be found in the following link: https://github.com/khaled-buet/CRISPRpred. |
url |
http://europepmc.org/articles/PMC5540555?pdf=render |
work_keys_str_mv |
AT mdkhaledurrahman crisprpredaflexibleandefficienttoolforsgrnasontargetactivitypredictionincrisprcas9systems AT msohelrahman crisprpredaflexibleandefficienttoolforsgrnasontargetactivitypredictionincrisprcas9systems |
_version_ |
1725119006842552320 |