Summary: | Most parasites of the phylum Apicomplexa contain a relict plastid of prokaryotic origin called the apicoplast. This organelle is not only important for the survival of the parasite; its unique properties also make it an ideal drug target. The majority of apicoplast-associated proteins are nuclear encoded and targeted post-translationally to the organellar lumen via a bipartite signaling mechanism that requires an N-terminal signal peptide and transit peptide (TP). Attempts to define a consensus motif that universally identifies apicoplast TPs have failed. In this study, we propose a generalized rule-based classification model to identify apicoplast-targeted proteins (ApicoTPs) that use a bipartite signaling mechanism. Given a training set specific to an organism, this model, called ApicoAP, uses a genetic algorithm to tailor a discriminating rule that exploits the known characteristics of ApicoTPs. The performance of ApicoAP is evaluated on four labeled datasets of Plasmodium falciparum, Plasmodium yoelii, Babesia bovis, and Toxoplasma gondii proteins. On the published P. falciparum dataset, ApicoAP raises classification accuracy to 94%, up from the 90% originally obtained with PlasmoAP. We present a parametric model for ApicoTPs and a procedure to optimize the model parameters for a given training set. A major asset of this model is that it can be customized to different parasite genomes. The ApicoAP prediction software is available at http://code.google.com/p/apicoap/ and http://bcb.eecs.wsu.edu.
|
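The idea of tailoring a parametric rule to a training set with a genetic algorithm can be illustrated with a minimal sketch. This is not the ApicoAP rule or its implementation: the rule below (fraction of basic residues K/R in a leading window), the parameter names `window` and `threshold`, the helper functions, and the training sequences are all hypothetical and chosen only to show the general shape of such a procedure.

```python
import random

BASIC = set("KR")  # assumption: toy feature, not the ApicoAP feature set


def rule(seq, window, threshold):
    """Toy rule: predict 'targeted' when the basic-residue fraction in the
    first `window` residues is at least `threshold`."""
    region = seq[:window]
    if not region:
        return False
    frac = sum(ch in BASIC for ch in region) / len(region)
    return frac >= threshold


def accuracy(params, data):
    """Fraction of labeled sequences that the rule classifies correctly."""
    window, threshold = params
    hits = sum(rule(seq, window, threshold) == label for seq, label in data)
    return hits / len(data)


def evolve(data, pop_size=20, generations=30, seed=0):
    """Small genetic algorithm over (window, threshold) pairs."""
    rng = random.Random(seed)
    pop = [(rng.randint(5, 30), rng.random()) for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged (elitism).
        pop.sort(key=lambda p: accuracy(p, data), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each parameter comes from either parent.
            window = rng.choice((a[0], b[0]))
            threshold = rng.choice((a[1], b[1]))
            # Mutation: occasionally perturb both parameters within bounds.
            if rng.random() < 0.3:
                window = min(30, max(5, window + rng.randint(-3, 3)))
                threshold = min(1.0, max(0.0, threshold + rng.gauss(0, 0.1)))
            children.append((window, threshold))
        pop = parents + children
    return max(pop, key=lambda p: accuracy(p, data))


# Synthetic sequences invented for illustration; not real protein data.
TRAIN = [
    ("KKNKRKNKKRNKKIKNKRKK", True),
    ("RKKNKKRKNNKKRKKNKIKK", True),
    ("KNKKRKKNKRKKNKKRNKKK", True),
    ("DEELLAGDEVLSDEAGLDEE", False),
    ("AGLLDEVSAGEDLLAVSDGE", False),
    ("LDEAGVSLEDDAGLLSEVDA", False),
]

best = evolve(TRAIN)
print("best (window, threshold):", best)
print("training accuracy:", accuracy(best, TRAIN))
```

Swapping `TRAIN` for a labeled set from another organism re-tunes the same rule to that set, which is the sense in which a parametric model of this kind can be customized per genome. A real evaluation would measure accuracy on held-out data rather than on the training set.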