Prediction of amyloid fibril-forming segments based on a support vector machine

Bibliographic Details
Main Authors: Guo Jun, Wu Ningfeng, Tian Jian, Fan Yunliu
Format: Article
Language: English
Published: BMC 2009-01-01
Series: BMC Bioinformatics
id doaj-e8f96430a65245b7bf8563d409f21c90
record_format Article
spelling doaj-e8f96430a65245b7bf8563d409f21c90; 2020-11-25T00:29:06Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2009-01-01; vol. 10, Suppl 1, S45; DOI 10.1186/1471-2105-10-S1-S45
collection DOAJ
language English
format Article
sources DOAJ
author Guo Jun
Wu Ningfeng
Tian Jian
Fan Yunliu
title Prediction of amyloid fibril-forming segments based on a support vector machine
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2009-01-01
description Abstract
Background: Amyloid fibrillar aggregates of proteins or polypeptides are known to be associated with many human diseases. Recent studies suggest that short protein regions trigger this aggregation. Identifying these short peptides is therefore critical for understanding these diseases and finding potential therapeutic targets.
Results: We propose a method named Pafig (Prediction of amyloid fibril-forming segments), based on support vector machines, to identify the hexapeptides associated with amyloid fibrillar aggregates. The features of Pafig were obtained by a two-round selection from AAindex. In a 10-fold cross-validation test on the Hexpepset dataset, Pafig achieved an overall accuracy of 81% and a Matthews correlation coefficient of 0.63. Pafig was then used to predict the potential fibril-forming hexapeptides among all 64,000,000 possible hexapeptides. Approximately 5.08% of these hexapeptides showed a high aggregation propensity. In the predicted fibril-forming hexapeptides, the amino acids alanine, phenylalanine, isoleucine, leucine and valine occurred at higher frequencies, whereas aspartic acid, glutamic acid, histidine, lysine, arginine and proline appeared at lower frequencies.
Conclusion: The performance of Pafig indicates that it is a powerful tool for identifying the hexapeptides associated with fibrillar aggregates and will be useful for large-scale analysis of proteomic data.
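The two quantitative ideas in the abstract can be illustrated with a short sketch. It is not the Pafig implementation: it only shows why there are 64,000,000 hexapeptides (20 standard amino acids at each of 6 positions) and how overall accuracy and the Matthews correlation coefficient are computed from a binary confusion matrix. The function names and the one-letter alphabet constant are assumptions introduced for illustration.

```python
import math
from itertools import product

# Assumption: the 20 standard amino acids, written as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def hexapeptide_count(alphabet=AMINO_ACIDS, length=6):
    """Number of distinct peptides of the given length over the alphabet."""
    return len(alphabet) ** length


def iter_hexapeptides(alphabet=AMINO_ACIDS, length=6):
    """Lazily yield every peptide of the given length as a string."""
    for residues in product(alphabet, repeat=length):
        yield "".join(residues)


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    print(hexapeptide_count())  # 64000000, i.e. 20 ** 6
```

Enumerating lazily with a generator avoids holding all 64,000,000 strings in memory at once, which matters if each one is to be scored by a classifier in turn.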