
Frontiers in Bioscience-Landmark (FBL) is published by IMR Press from Volume 26 Issue 5 (2021). Previous articles were published by another publisher on a subscription basis, and they are hosted by IMR Press on imrpress.com as a courtesy and upon agreement with Frontiers in Bioscience.

Article
Prediction of protein allergenicity using local description of amino acid sequence
1 Data Mining Department, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613
2 Department of Biological Sciences and Department of Biochemistry, National University of Singapore, 14 Science Drive 4, Singapore 117543
3 Genome Institute, National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Paholyothin Rd., Klong 1, Klong Luang, Pathumthani, 12120 Thailand
4 Karolinska Institutet, Department of Microbiology, Tumor and Cell Biology, Stockholm, Sweden

*Author to whom correspondence should be addressed.

 

Front. Biosci. (Landmark Ed) 2008, 13(16), 6072–6078; https://doi.org/10.2741/3138
Published: 1 May 2008
Abstract

The constant increase in atopic allergy and other hypersensitivity reactions has intensified the need for successful therapeutic approaches. Existing bioinformatic tools for predicting allergenic potential are primarily based on sequence similarity searches along the entire protein sequence and do not address the dual issues of conformational and overlapping B-cell epitope recognition sites. In this study, we report AllerPred, a computational system that uses a support vector machine (SVM) as its prediction engine. A novel representation of local protein sequence descriptors enables the system to model multiple overlapping continuous and discontinuous B-cell epitope binding patterns within a protein sequence. The model was rigorously trained and tested using 669 IUIS allergens and 1237 non-allergens. Testing results showed that the area under the receiver operating characteristic curve (AROC) of the SVM models is 0.81, with 76% sensitivity at a specificity of 76%. On a standardized testing dataset of experimentally validated allergen and non-allergen sequences, this approach consistently outperforms existing allergenicity prediction systems.
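For readers unfamiliar with the evaluation measures quoted in the abstract, the following is a minimal sketch of how an area under the ROC curve and a sensitivity/specificity pair can be computed from classifier scores. It is not the AllerPred code: the function names `roc_auc` and `sensitivity_specificity`, the toy labels and scores, and the 0.5 threshold are all illustrative assumptions, and the AUC is obtained with the standard pairwise (Mann-Whitney) formulation rather than any procedure stated in the article.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: 1 for allergen, 0 for non-allergen (hypothetical encoding).
    scores: classifier decision values; higher means "more allergen-like".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, invented for illustration only (not taken from the article).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]

if __name__ == "__main__":
    print(roc_auc(toy_labels, toy_scores))                        # 0.9375
    print(sensitivity_specificity(toy_labels, toy_scores, 0.5))   # (0.75, 0.75)
```

Moving the threshold trades sensitivity against specificity, so an operating point where the two are equal, such as the 76%/76% point reported above, corresponds to one particular threshold on the score.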
