Network approaches for exploring predicted proteins of unknown function in the sequenced genome of plant pathogenic fungi


Bibliographic Details
Main Author: Janowska-Sejda, Elzbieta Iwona
Other Authors: Tsoka, Sophia
Published: King's College London (University of London) 2018
Subjects:
004
Online Access: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.754920
Description
Summary: Diseases caused by plant pathogenic fungi have a major impact on crop production, leading to local or even global food and feed shortages. In addition, secondary metabolites produced by several pathogenic fungi can cause serious health problems in animals and even humans. The identification of virulence genes in plant pathogenic fungi is therefore of great importance, since it would help to explain the infection process and aid the development of control strategies. With the advent of new sequencing technologies, the number of available whole-genome sequences and predicted proteomes is rapidly increasing for numerous plant pathogenic fungi. However, there are still many proteins with no assigned molecular function. At the same time, high-quality classification of protein families/domains and mutant phenotype information is increasingly available from databases such as PFAM and the Pathogen-Host Interactions database (PHI-base), respectively. The main objective of this work is to explore in depth proteins of unknown function and thereby speculate on their roles in virulence. In this study, various computational network approaches have been applied to integrate available biological data for selected eukaryotic pathogens. The Markov Cluster Algorithm was implemented to detect plant pathogen-specific and animal pathogen-specific gene clusters. Further, a neighbourhood-based network analysis approach was combined with a high-confidence domain-domain interaction (DDI) and interolog network analysis to predict candidate virulence genes in Fusarium graminearum, a globally important cereal-infecting and mycotoxin-producing plant pathogenic fungus. Collectively, these analyses newly assigned 65 proteins a role in virulence. Most of those predicted proteins are thought to be part of the mitogen-activated protein kinase signaling pathways activated in F. graminearum during wheat ear infection.
One gene, namely FGSG_06444, was identified as a high-priority candidate for further biological experiments. Another new computational approach carried out in this work was the application of a domain-association network to the functional prediction of Domains of Unknown Function (DUFs). Here, available phenotypic data for gene mutants curated in PHI-base were integrated with taxonomic information, as well as topological properties of protein domains. Results from this novel analysis rejected the hypothesis that certain DUFs are linked to the virulence process of fungal plant pathogens. However, several DUFs were assigned a role in core metabolism (proteins essential for life) instead. Furthermore, a taxonomic diversity study of domains and Louvain community clustering identified 35 DUFs as fungal-specific domains. A novel life-strategy integration analysis was developed, in which biological information from species employing saprophytic, heterotrophic and biotrophic lifestyles can be integrated into one platform. This was achieved by combining a Protein Bigrams Overlap Network approach with SimMod analysis. Here, two M. oryzae proteins (MGG_09419 and MGG_03468) were identified as novel effector protein candidates, and six additional F. graminearum proteins were identified as members of polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) secondary metabolite pathways (FGSG_00036, FGSG_04588, FGSG_05321, FGSG_10464, FGSG_17387 and FGSG_17677). These genes are likely to be involved in virulence and were suggested to Rothamsted experimental biologists for testing in gene deletion experiments to confirm their function. Overall, this study has implemented different approaches to investigate and assign molecular and biological functions to unannotated proteins in plant pathogenic fungi.
By employing graph theory to integrate and analyse functional domain information from the PFAM database and mutant phenotype information from PHI-base, it is possible to identify candidate genes responsible for virulence in fungal pathogens.
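
As a rough illustration of the general idea of neighbourhood-based prediction on a domain-derived network, the following minimal Python sketch links proteins that share a domain and scores each unlabelled protein by the fraction of its neighbours carrying a virulence-related phenotype. It is not the method used in the thesis: the protein names, domain labels, phenotype set and the functions `build_protein_network` and `neighbourhood_scores` are all hypothetical, and the scoring rule is a simple guilt-by-association assumption chosen for brevity.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: protein -> set of Pfam-style domain labels.
protein_domains = {
    "prot_A": {"Pkinase", "DUF_x"},
    "prot_B": {"Pkinase"},
    "prot_C": {"DUF_x", "Zn_clus"},
    "prot_D": {"Zn_clus"},
    "prot_E": {"Sugar_tr"},
}

# Hypothetical phenotype labels in the style of PHI-base annotations.
reduced_virulence = {"prot_B", "prot_D"}


def build_protein_network(protein_domains):
    """Link two proteins when they share at least one domain."""
    adjacency = defaultdict(set)
    for p, q in combinations(sorted(protein_domains), 2):
        if protein_domains[p] & protein_domains[q]:
            adjacency[p].add(q)
            adjacency[q].add(p)
    return adjacency


def neighbourhood_scores(adjacency, labelled, candidates):
    """Fraction of each candidate's neighbours that carry the phenotype label."""
    scores = {}
    for protein in candidates:
        neighbours = adjacency.get(protein, set())
        scores[protein] = (
            len(neighbours & labelled) / len(neighbours) if neighbours else 0.0
        )
    return scores


network = build_protein_network(protein_domains)
unlabelled = sorted(set(protein_domains) - reduced_virulence)
scores = neighbourhood_scores(network, reduced_virulence, unlabelled)

# Rank candidates from highest to lowest neighbourhood score.
for protein, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(protein, round(score, 2))
```

In this toy network, a protein with no shared domains (here `prot_E`) has no neighbours and receives a score of 0.0, which shows why such approaches depend on the coverage of the underlying domain annotation.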