Calpain cleavage prediction using multiple kernel learning.
Calpain, an intracellular Ca²⁺-dependent cysteine protease, is known to play a role in a wide range of metabolic pathways through limited proteolysis of its substrates. However, only a limited number of these substrates are currently known, with the exact mechanism of substrate recognition and cleav...
Main Authors: | David A DuVerle, Yasuko Ono, Hiroyuki Sorimachi, Hiroshi Mamitsuka |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2011-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC3086883?pdf=render |
id |
doaj-a2df3af651314009808a9e0aae3c4943 |
---|---|
record_format |
Article |
spelling |
doaj-a2df3af651314009808a9e0aae3c4943; 2020-11-25T02:10:30Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2011-01-01; vol. 6, no. 5, e19035; 10.1371/journal.pone.0019035; Calpain cleavage prediction using multiple kernel learning.; David A DuVerle, Yasuko Ono, Hiroyuki Sorimachi, Hiroshi Mamitsuka; http://europepmc.org/articles/PMC3086883?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
David A DuVerle Yasuko Ono Hiroyuki Sorimachi Hiroshi Mamitsuka |
author_sort |
David A DuVerle |
title |
Calpain cleavage prediction using multiple kernel learning. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2011-01-01 |
description |
Calpain, an intracellular Ca²⁺-dependent cysteine protease, is known to play a role in a wide range of metabolic pathways through limited proteolysis of its substrates. However, only a limited number of these substrates are currently known, with the exact mechanism of substrate recognition and cleavage by calpain still largely unknown. While previous research has successfully applied standard machine-learning algorithms to accurately predict substrate cleavage by other similar types of proteases, their approach does not extend well to calpain, possibly due to its particular mode of proteolytic action and limited amount of experimental data. Through the use of Multiple Kernel Learning, a recent extension to the classic Support Vector Machine framework, we were able to train complex models based on rich, heterogeneous feature sets, leading to significantly improved prediction quality (6% over highest AUC score produced by state-of-the-art methods). In addition to producing a stronger machine-learning model for the prediction of calpain cleavage, we were able to highlight the importance and role of each feature of substrate sequences in defining specificity: primary sequence, secondary structure and solvent accessibility. Most notably, we showed there existed significant specificity differences across calpain sub-types, despite previous assumption to the contrary. Prediction accuracy was further successfully validated using, as an unbiased test set, mutated sequences of calpastatin (endogenous inhibitor of calpain) modified to no longer block calpain's proteolytic action. An online implementation of our prediction tool is available at http://calpain.org. |
url |
http://europepmc.org/articles/PMC3086883?pdf=render |
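The abstract describes combining heterogeneous feature sets (primary sequence, secondary structure, solvent accessibility) through Multiple Kernel Learning and comparing models by AUC. The sketch below illustrates only the general idea of weighting one kernel per feature view. It is not the authors' implementation: the weights come from a simple kernel-target alignment heuristic rather than an optimization-based MKL solver, the scoring function is a plain kernel-weighted vote standing in for a Support Vector Machine, and all feature vectors, labels, and function names are hypothetical toy values.

```python
import math

def linear_kernel(X):
    """Gram matrix of dot products between all rows of X."""
    return [[sum(a * b for a, b in zip(x, z)) for z in X] for x in X]

def alignment(K, y):
    """Kernel-target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F), labels in {-1, +1}."""
    n = len(y)
    num = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    norm_k = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    return num / (norm_k * n) if norm_k else 0.0  # ||yy^T||_F equals n

def combine_kernels(kernels, y):
    """Weight each kernel by its (non-negative) alignment, normalised to sum to 1."""
    raw = [max(alignment(K, y), 0.0) for K in kernels]
    total = sum(raw)
    weights = [r / total for r in raw] if total else [1.0 / len(kernels)] * len(kernels)
    n = len(y)
    combined = [[sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
                for i in range(n)]
    return weights, combined

def auc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ranked (pos, neg) pairs."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == -1]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: four candidate sites, +1 = cleaved, -1 = not cleaved.
labels = [1, 1, -1, -1]
views = {
    "sequence":      [[1, 0], [1, 0], [0, 1], [0, 1]],  # separates the classes
    "structure":     [[1, 0], [0, 1], [1, 0], [0, 1]],  # unrelated to the labels
    "accessibility": [[1], [1], [1], [-1]],             # weakly informative
}
kernels = [linear_kernel(X) for X in views.values()]
weights, K = combine_kernels(kernels, labels)

# Stand-in scorer (not an SVM): kernel-weighted vote of the labelled examples.
scores = [sum(labels[j] * K[i][j] for j in range(len(labels))) for i in range(len(labels))]
area = auc(scores, labels)  # computed on the training examples, so optimistic

for name, w in zip(views, weights):
    print(f"{name}: weight {w:.3f}")
print(f"AUC on toy data: {area:.2f}")
```

In this toy setting the learned weights indicate how much each feature view contributes to the combined kernel, which mirrors, in a much simplified form, how per-kernel weights can be read as feature importance.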