iAPSL-IF: Identification of Apoptosis Protein Subcellular Location Using Integrative Features Captured from Amino Acid Sequences

Apoptosis proteins (APs) control normal tissue homeostasis by regulating the balance between cell proliferation and death. The function of APs is strongly related to their subcellular location. To date, computational methods have been reported that reliably identify the subcellular location of APs,...


Bibliographic Details
Main Authors: Yadong Tang, Lu Xie, Lanming Chen
Format: Article
Language: English
Published: MDPI AG 2018-04-01
Series: International Journal of Molecular Sciences
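Subjects: apoptosis proteins; Markov chains; physicochemical properties; position specific scoring matrix; support vector machine; recursive feature elimination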
Online Access: http://www.mdpi.com/1422-0067/19/4/1190
id doaj-94074b0250ef456996df09b7b49c7427
record_format Article
spelling doaj-94074b0250ef456996df09b7b49c7427
2020-11-24T21:40:04Z
eng
MDPI AG
International Journal of Molecular Sciences, ISSN 1422-0067
2018-04-01, volume 19, issue 4, article 1190
DOI 10.3390/ijms19041190 (ijms19041190)
Yadong Tang (0)
Lu Xie (1)
Lanming Chen (2)
(0) Key Laboratory of Quality and Safety Risk Assessment for Aquatic Products on Storage and Preservation (Shanghai), China Ministry of Agriculture, College of Food Science and Technology, Shanghai Ocean University, Shanghai 201306, China
(1) Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai 201203, China
(2) Key Laboratory of Quality and Safety Risk Assessment for Aquatic Products on Storage and Preservation (Shanghai), China Ministry of Agriculture, College of Food Science and Technology, Shanghai Ocean University, Shanghai 201306, China
collection DOAJ
language English
format Article
sources DOAJ
author Yadong Tang
Lu Xie
Lanming Chen
title iAPSL-IF: Identification of Apoptosis Protein Subcellular Location Using Integrative Features Captured from Amino Acid Sequences
publisher MDPI AG
series International Journal of Molecular Sciences
issn 1422-0067
publishDate 2018-04-01
description Apoptosis proteins (APs) control normal tissue homeostasis by regulating the balance between cell proliferation and death. The function of APs is strongly related to their subcellular location. Computational methods that reliably identify the subcellular location of APs have been reported; however, there is still room to improve prediction accuracy. In this study, we developed a novel method named iAPSL-IF (identification of apoptosis protein subcellular location using integrative features), which is based on integrative features captured from Markov chains, physicochemical property matrices, and position-specific score matrices (PSSMs) of amino acid sequences. Matrices of different lengths were transformed into fixed-length feature vectors using an auto cross-covariance (ACC) method. An optimal subset of the features was chosen using a recursive feature elimination (RFE) algorithm, and a support vector machine (SVM) classifier was trained on the selected features. iAPSL-IF was evaluated on three datasets (ZD98, CL317, and ZW225) using a jackknife cross-validation test. The results showed that iAPSL-IF outperformed the known predictors reported in the literature: its overall accuracy was 98.98% (ZD98), 94.95% (CL317), and 97.33% (ZW225); the Matthews correlation coefficient, sensitivity, and specificity for several classes of subcellular location proteins (e.g., membrane proteins, cytoplasmic proteins, endoplasmic reticulum proteins, nuclear proteins, and secreted proteins) in the datasets were 0.92–1.0, 94.23–100%, and 97.07–100%, respectively. Overall, this study provides a high-throughput, sequence-based method for better identification of the subcellular location of APs and facilitates further understanding of programmed cell death in organisms.
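The ACC step named in the description can be illustrated with a small sketch. It assumes the commonly used definition of auto cross-covariance, in which each pair of matrix columns is correlated at lags 1 to a chosen maximum; the record does not state the paper's exact normalization or lag range, so the function name `acc_transform`, the parameter `max_lag`, and the toy matrices below are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence


def acc_transform(matrix: Sequence[Sequence[float]], max_lag: int) -> List[float]:
    """Map an L x N matrix to a vector of N * N * max_lag covariance features.

    Rows are sequence positions, columns are properties (for example PSSM
    columns). Terms with j1 == j2 are auto-covariances; the rest are
    cross-covariances.
    """
    n_rows = len(matrix)
    if n_rows <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    n_cols = len(matrix[0])
    # Mean of each column over the whole sequence.
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    features: List[float] = []
    for lag in range(1, max_lag + 1):
        span = n_rows - lag  # number of position pairs separated by `lag`
        for j1 in range(n_cols):
            for j2 in range(n_cols):
                total = sum(
                    (matrix[i][j1] - means[j1]) * (matrix[i + lag][j2] - means[j2])
                    for i in range(span)
                )
                features.append(total / span)
    return features


# Hypothetical toy matrices of different lengths (4 and 6 positions, 2 columns).
short = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0]]
longer = short + [[5.0, 0.0], [6.0, 1.0]]
vec_short = acc_transform(short, max_lag=2)
vec_long = acc_transform(longer, max_lag=2)
print(len(vec_short), len(vec_long))
```

Both inputs yield vectors of the same length (columns squared times `max_lag`), which is the property that lets sequences of different lengths share one classifier input space.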
topic apoptosis proteins
Markov chains
physicochemical properties
position specific scoring matrix
support vector machine
recursive feature elimination
url http://www.mdpi.com/1422-0067/19/4/1190
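The per-class figures quoted in the description (Matthews correlation coefficient, sensitivity, specificity) and the overall accuracy can be computed from a list of true and predicted locations, such as the predictions collected when each protein is left out once in a jackknife test. The following is a minimal sketch using the standard one-vs-rest definitions; the function names and the labels are made up for illustration and are not taken from ZD98, CL317, or ZW225.

```python
from math import sqrt
from typing import Dict, Sequence


def one_vs_rest_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], label: str
) -> Dict[str, float]:
    """Sensitivity, specificity, and MCC for one class against all others."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == label and pred == label:
            tp += 1
        elif truth == label:
            fn += 1
        elif pred == label:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is reported as 0.0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


def overall_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of proteins whose predicted location matches the true one."""
    correct = sum(1 for truth, pred in zip(y_true, y_pred) if truth == pred)
    return correct / len(y_true)


# Hypothetical labels: one cytoplasmic protein is misassigned to the membrane.
y_true = ["cyto", "cyto", "memb", "memb", "nucl", "nucl"]
y_pred = ["cyto", "memb", "memb", "memb", "nucl", "nucl"]
memb = one_vs_rest_metrics(y_true, y_pred, "memb")
accuracy = overall_accuracy(y_true, y_pred)
print(memb, accuracy)
```

In this toy case the membrane class has sensitivity 1.0 and specificity 0.75, showing how a single false positive lowers specificity and MCC without affecting sensitivity.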