Subcellular location prediction of apoptosis proteins using two novel feature extraction methods based on evolutionary information and LDA

Abstract Background Apoptosis, also called programmed cell death, refers to the spontaneous and orderly death of cells controlled by genes in order to maintain a stable internal environment. Identifying the subcellular location of apoptosis proteins is very helpful in understanding the mechanism of...

Bibliographic Details
Main Authors: Lei Du, Qingfang Meng, Yuehui Chen, Peng Wu
Format: Article
Language: English
Published: BMC 2020-05-01
Series: BMC Bioinformatics
Subjects:
Online Access: http://link.springer.com/article/10.1186/s12859-020-3539-1
id doaj-16dc80806a5c435e98a9fb337cdce9b4
record_format Article
spelling doaj-16dc80806a5c435e98a9fb337cdce9b4
2020-11-25T03:05:34Z
eng
BMC
BMC Bioinformatics
1471-2105
2020-05-01
volume 21, issue 1, pages 1-19
10.1186/s12859-020-3539-1
Subcellular location prediction of apoptosis proteins using two novel feature extraction methods based on evolutionary information and LDA
Lei Du, Qingfang Meng, Yuehui Chen, Peng Wu
School of Information Science and Engineering, University of Jinan (all four authors)
http://link.springer.com/article/10.1186/s12859-020-3539-1
Subcellular location
Position-specific scoring matrix
Consensus sequence
Absolute entropy correlation analysis
Linear discriminant analysis
collection DOAJ
language English
format Article
sources DOAJ
author Lei Du
Qingfang Meng
Yuehui Chen
Peng Wu
title Subcellular location prediction of apoptosis proteins using two novel feature extraction methods based on evolutionary information and LDA
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2020-05-01
description Abstract Background: Apoptosis, also called programmed cell death, is the spontaneous and orderly death of cells, controlled by genes, that maintains a stable internal environment. Identifying the subcellular location of apoptosis proteins helps in understanding the mechanism of apoptosis and in designing drugs, so the subcellular localization of apoptosis proteins has attracted increasing attention in computational biology. Effective feature extraction methods play a critical role in predicting the subcellular location of proteins. Results: We propose two novel feature extraction methods based on evolutionary information. The first obtains evolutionary information from the transition matrix of the consensus sequence (CTM). The second extracts evolutionary information from the position-specific scoring matrix (PSSM) using absolute entropy correlation analysis (AECA-PSSM). After the two kinds of features are fused, linear discriminant analysis (LDA) is used to reduce their dimension, and a support vector machine (SVM) predicts the protein subcellular locations. The proposed CTM-AECA-PSSM-LDA method was evaluated on the CL317 and ZW225 datasets. Under the jackknife test, the overall accuracy was 99.7% on CL317 and 95.6% on ZW225. Conclusions: The experimental results show that the proposed method can effectively extract more abundant features from protein sequences and is feasible for predicting the subcellular location of apoptosis proteins; it may serve as a complementary tool to existing subcellular localization methods.
topic Subcellular location
Position-specific scoring matrix
Consensus sequence
Absolute entropy correlation analysis
Linear discriminant analysis
url http://link.springer.com/article/10.1186/s12859-020-3539-1
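The description above names a consensus-sequence transition matrix (CTM) and a jackknife test but does not define them in this record. The following is a minimal Python sketch of one plausible reading, not the authors' implementation. It assumes the consensus residue at each position is the highest-scoring PSSM column, that transitions are counted between adjacent residues and row-normalised, and that the PSSM columns follow the `ARNDCQEGHILKMFPSTWYV` order. The function names are hypothetical, and the nearest-neighbour classifier is only a stand-in for the SVM (the AECA-PSSM and LDA steps are omitted).

```python
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # assumed PSSM column order


def consensus_sequence(pssm):
    """Pick, for each row (sequence position), the highest-scoring residue."""
    return "".join(
        AMINO_ACIDS[max(range(len(AMINO_ACIDS)), key=row.__getitem__)]
        for row in pssm
    )


def transition_matrix(seq):
    """Count adjacent-residue transitions and normalise each row to sum to 1."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    counts = [[0.0] * 20 for _ in range(20)]
    for a, b in zip(seq, seq[1:]):
        counts[index[a]][index[b]] += 1.0
    for row in counts:
        total = sum(row)
        if total:  # leave all-zero rows untouched
            for j in range(20):
                row[j] /= total
    return counts


def ctm_features(pssm):
    """Flatten the 20x20 transition matrix into a 400-dimensional vector."""
    matrix = transition_matrix(consensus_sequence(pssm))
    return [value for row in matrix for value in row]


def nearest_neighbour(train_x, train_y, query):
    """Stand-in classifier (not an SVM): 1-NN on squared Euclidean distance."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    best = min(range(len(train_x)), key=lambda k: dist(train_x[k], query))
    return train_y[best]


def jackknife_accuracy(features, labels, classify):
    """Leave-one-out test: hold out each sample once, train on all the others."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)
```

In this sketch, `jackknife_accuracy(vectors, locations, nearest_neighbour)` returns the fraction of proteins whose location is predicted correctly when each one is held out in turn, which is the quantity the record reports as overall accuracy.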