iAPSL-IF: Identification of Apoptosis Protein Subcellular Location Using Integrative Features Captured from Amino Acid Sequences
Apoptosis proteins (APs) control normal tissue homeostasis by regulating the balance between cell proliferation and death. The function of APs is strongly related to their subcellular location. To date, computational methods have been reported that reliably identify the subcellular location of APs,...
Main Authors: | Yadong Tang, Lu Xie, Lanming Chen |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2018-04-01 |
Series: | International Journal of Molecular Sciences |
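Subjects: | apoptosis proteins; Markov chains; physicochemical properties; position specific scoring matrix; support vector machine; recursive feature elimination |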
Online Access: | http://www.mdpi.com/1422-0067/19/4/1190 |
id |
doaj-94074b0250ef456996df09b7b49c7427 |
---|---|
record_format |
Article |
spelling |
doaj-94074b0250ef456996df09b7b49c7427; 2020-11-24T21:40:04Z; eng; MDPI AG; International Journal of Molecular Sciences; 1422-0067; 2018-04-01; 19; 4; 1190; 10.3390/ijms19041190; ijms19041190; iAPSL-IF: Identification of Apoptosis Protein Subcellular Location Using Integrative Features Captured from Amino Acid Sequences; Yadong Tang 0; Lu Xie 1; Lanming Chen 2; Key Laboratory of Quality and Safety Risk Assessment for Aquatic Products on Storage and Preservation (Shanghai), China Ministry of Agriculture, College of Food Science and Technology, Shanghai Ocean University, Shanghai 201306, China; Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai 201203, China; Key Laboratory of Quality and Safety Risk Assessment for Aquatic Products on Storage and Preservation (Shanghai), China Ministry of Agriculture, College of Food Science and Technology, Shanghai Ocean University, Shanghai 201306, China; Apoptosis proteins (APs) control normal tissue homeostasis by regulating the balance between cell proliferation and death. The function of APs is strongly related to their subcellular location. To date, computational methods have been reported that reliably identify the subcellular location of APs; however, there is still room to improve prediction accuracy. In this study, we developed a novel method named iAPSL-IF (identification of apoptosis protein subcellular location-integrative features), which is based on integrative features captured from Markov chains, physicochemical property matrices, and position-specific score matrices (PSSMs) of amino acid sequences. Matrices of different lengths were transformed into fixed-length feature vectors using an auto cross-covariance (ACC) method. An optimal subset of the features was chosen using a recursive feature elimination (RFE) algorithm, and a support vector machine (SVM) classifier was trained on the sequences represented by these features. Using three datasets (ZD98, CL317, and ZW225), iAPSL-IF was evaluated with a jackknife cross-validation test. The results showed that iAPSL-IF outperformed the known predictors reported in the literature: its overall accuracy on the three datasets was 98.98% (ZD98), 94.95% (CL317), and 97.33% (ZW225); the Matthews correlation coefficient, sensitivity, and specificity for several classes of subcellular location proteins (e.g., membrane proteins, cytoplasmic proteins, endoplasmic reticulum proteins, nuclear proteins, and secreted proteins) in the datasets were 0.92–1.0, 94.23–100%, and 97.07–100%, respectively. Overall, this study provides a high-throughput, sequence-based method for better identification of the subcellular location of APs and facilitates further understanding of programmed cell death in organisms.; http://www.mdpi.com/1422-0067/19/4/1190; apoptosis proteins; Markov chains; physicochemical properties; position specific scoring matrix; support vector machine; recursive feature elimination |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Yadong Tang, Lu Xie, Lanming Chen |
spellingShingle |
Yadong Tang; Lu Xie; Lanming Chen; iAPSL-IF: Identification of Apoptosis Protein Subcellular Location Using Integrative Features Captured from Amino Acid Sequences; International Journal of Molecular Sciences; apoptosis proteins; Markov chains; physicochemical properties; position specific scoring matrix; support vector machine; recursive feature elimination |
author_facet |
Yadong Tang, Lu Xie, Lanming Chen |
author_sort |
Yadong Tang |
title |
iAPSL-IF: Identification of Apoptosis Protein Subcellular Location Using Integrative Features Captured from Amino Acid Sequences |
publisher |
MDPI AG |
series |
International Journal of Molecular Sciences |
issn |
1422-0067 |
publishDate |
2018-04-01 |
description |
Apoptosis proteins (APs) control normal tissue homeostasis by regulating the balance between cell proliferation and death. The function of APs is strongly related to their subcellular location. To date, computational methods have been reported that reliably identify the subcellular location of APs; however, there is still room to improve prediction accuracy. In this study, we developed a novel method named iAPSL-IF (identification of apoptosis protein subcellular location-integrative features), which is based on integrative features captured from Markov chains, physicochemical property matrices, and position-specific score matrices (PSSMs) of amino acid sequences. Matrices of different lengths were transformed into fixed-length feature vectors using an auto cross-covariance (ACC) method. An optimal subset of the features was chosen using a recursive feature elimination (RFE) algorithm, and a support vector machine (SVM) classifier was trained on the sequences represented by these features. Using three datasets (ZD98, CL317, and ZW225), iAPSL-IF was evaluated with a jackknife cross-validation test. The results showed that iAPSL-IF outperformed the known predictors reported in the literature: its overall accuracy on the three datasets was 98.98% (ZD98), 94.95% (CL317), and 97.33% (ZW225); the Matthews correlation coefficient, sensitivity, and specificity for several classes of subcellular location proteins (e.g., membrane proteins, cytoplasmic proteins, endoplasmic reticulum proteins, nuclear proteins, and secreted proteins) in the datasets were 0.92–1.0, 94.23–100%, and 97.07–100%, respectively. Overall, this study provides a high-throughput, sequence-based method for better identification of the subcellular location of APs and facilitates further understanding of programmed cell death in organisms. |
topic |
apoptosis proteins; Markov chains; physicochemical properties; position specific scoring matrix; support vector machine; recursive feature elimination |
url |
http://www.mdpi.com/1422-0067/19/4/1190 |
_version_ |
1725928310856220672 |
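
The abstract states that matrices of different lengths were turned into fixed-length feature vectors with an auto cross-covariance (ACC) method, but this record does not give the formula. The sketch below uses a commonly cited ACC formulation and may differ from the authors' implementation: for each lag, it averages the product of mean-centred values of column `a` at position `j` and column `b` at position `j + lag`. The function name `acc_transform` and the parameter `max_lag` are hypothetical, chosen for illustration only.

```python
def acc_transform(matrix, max_lag):
    """Map an L x N matrix (L residues, N properties) to a vector of
    length max_lag * N * N, independent of L.

    Assumed formulation (not taken from the record):
    ACC(a, b, lag) = sum_j (S[j][a] - mean_a) * (S[j+lag][b] - mean_b) / (L - lag)
    where a == b gives the auto-covariance terms and a != b the
    cross-covariance terms.
    """
    length = len(matrix)
    if length <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    n_cols = len(matrix[0])

    # Column means over the whole sequence.
    means = [sum(row[c] for row in matrix) / length for c in range(n_cols)]

    features = []
    for lag in range(1, max_lag + 1):
        span = length - lag  # number of position pairs separated by this lag
        for a in range(n_cols):
            for b in range(n_cols):
                total = sum(
                    (matrix[j][a] - means[a]) * (matrix[j + lag][b] - means[b])
                    for j in range(span)
                )
                features.append(total / span)
    return features
```

Under this formulation, a 5-row and an 8-row matrix with the same two columns both map to vectors of length `max_lag * 2 * 2`, which is what lets sequences of different lengths share one feature space before feature selection and classification.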