Summary: | Clustered regularly interspaced short palindromic repeats (CRISPR) technology is the most important tool in gene editing; it can be used to target any gene using a guide RNA (gRNA) and a Cas enzyme. One limitation of CRISPR systems is low gRNA activity, so predicting gRNA activity is highly important. The activity of a gRNA can be quantified by its insertion or deletion (indel) frequency. In this work, we optimized a Convolutional Neural Network (CNN) by varying the convolution layer depth and filter kernel size to determine their effect on model performance. We also compared traditional Multiple Linear Regression (MLR), the CNN, and a hybrid model, CNN-SVR, which combines the CNN with a Support Vector Regressor (SVR), for the prediction of gRNA activity. Measured by Spearman Correlation (SC), the hybrid model outperformed the state-of-the-art model by up to 40% in predicting gRNA activity. Finally, to validate the hybrid model, we predicted the indel frequency of gRNA sequences used for the detection of COVID-19; this will assist in choosing the best gRNA for detecting the COVID-19 virus with the CRISPR/Cas12 system.
|
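Because the models are compared by Spearman Correlation between measured and predicted indel frequencies, a minimal sketch of that metric may help. It is written in plain Python and is not the authors' evaluation code; the function names `average_ranks` and `spearman` are hypothetical, and the numbers in the usage lines are toy values chosen for illustration, not results from this work.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(measured, predicted):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(measured), average_ranks(predicted)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Toy values for illustration only (not data from the study).
measured_indel = [12.0, 45.5, 30.1, 78.2, 5.4]
predicted_indel = [15.3, 40.0, 35.7, 70.9, 9.8]
print(f"Spearman correlation: {spearman(measured_indel, predicted_indel):.3f}")
```

Since the metric depends only on rank order, it rewards a model that orders gRNAs correctly by activity even when its predicted indel frequencies are on a different scale from the measured ones.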