AllerHunter: a SVM-pairwise system for assessment of allergenicity and allergic cross-reactivity in proteins.

Allergy is a major health problem in industrialized countries. The number of transgenic food crops is growing rapidly, creating the need for allergenicity assessment before they are introduced into the human food chain. While existing bioinformatic methods have achieved good accuracies for highly conserv...

Bibliographic Details
Main Authors: Hon Cheng Muh, Joo Chuan Tong, Martti T Tammi
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2009-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC2689655?pdf=render
id doaj-a6a973c824bd4ed496bca3b0dad28a41
record_format Article
spelling doaj-a6a973c824bd4ed496bca3b0dad28a41; 2020-11-25T01:15:21Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2009-01-01; 4; 6; e5861; 10.1371/journal.pone.0005861; AllerHunter: a SVM-pairwise system for assessment of allergenicity and allergic cross-reactivity in proteins.; Hon Cheng Muh; Joo Chuan Tong; Martti T Tammi; http://europepmc.org/articles/PMC2689655?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Hon Cheng Muh
Joo Chuan Tong
Martti T Tammi
title AllerHunter: a SVM-pairwise system for assessment of allergenicity and allergic cross-reactivity in proteins.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2009-01-01
description Allergy is a major health problem in industrialized countries. The number of transgenic food crops is growing rapidly, creating the need for allergenicity assessment before they are introduced into the human food chain. While existing bioinformatic methods have achieved good accuracy for highly conserved sequences, discriminating allergens and non-allergens from allergen-like non-allergen sequences remains difficult. We describe AllerHunter, a web-based computational system for the assessment of potential allergenicity and allergic cross-reactivity in proteins. It combines an iterative pairwise sequence similarity encoding scheme with a support vector machine (SVM) as the discriminating engine. The pairwise vectorization framework allows the system to model essential features in allergens that are involved in cross-reactivity, without being limited to distinct sets of physicochemical properties. The system was trained and tested using 1,356 known allergen and 13,449 putative non-allergen sequences, and the prediction models were validated through extensive testing. The system is effective at distinguishing allergens and non-allergens from allergen-like non-allergen sequences. On an independent dataset of 1,443 protein sequences, AllerHunter achieved a sensitivity of 83.4% and a specificity of 96.4% (accuracy = 95.3%, area under the receiver operating characteristic curve AROC = 0.928 ± 0.004, and Matthews correlation coefficient MCC = 0.738), performing significantly better than a number of existing methods. AllerHunter is available at http://tiger.dbs.nus.edu.sg/AllerHunter.
url http://europepmc.org/articles/PMC2689655?pdf=render