Assessment of protein set coherence using functional annotations

Abstract. Background: Analysis of large-scale experimental datasets frequently produces one or more sets of proteins that are subsequently mined for functional interpretation and validation. To this end, a number of computational methods have been devised...

Bibliographic Details
Main Authors: Carazo Jose M, Chagoyen Monica, Pascual-Montano Alberto
Format: Article
Language: English
Published: BMC 2008-10-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/9/444
id doaj-2960383bb17949f986d351c489f5f306
record_format Article
spelling doaj-2960383bb17949f986d351c489f5f306; 2020-11-25T01:18:24Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2008-10-01; 9; 1; 444; 10.1186/1471-2105-9-444; Assessment of protein set coherence using functional annotations; Carazo Jose M; Chagoyen Monica; Pascual-Montano Alberto; http://www.biomedcentral.com/1471-2105/9/444
collection DOAJ
language English
format Article
sources DOAJ
author Carazo Jose M
Chagoyen Monica
Pascual-Montano Alberto
title Assessment of protein set coherence using functional annotations
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2008-10-01
description Abstract. Background: Analysis of large-scale experimental datasets frequently produces one or more sets of proteins that are subsequently mined for functional interpretation and validation. To this end, a number of computational methods have been devised that rely on the analysis of functional annotations. Although current methods provide valuable information (e.g. significantly enriched annotations, pairwise functional similarities), they do not specifically measure the degree of homogeneity of a protein set. Results: In this work we present a method that scores the degree of functional homogeneity, or coherence, of a set of proteins on the basis of the global similarity of their functional annotations. The method uses statistical hypothesis testing to assess the significance of the set in the context of the functional space of a reference set. As such, it can be used as a first step in the validation of sets expected to be homogeneous prior to further functional interpretation. Conclusion: We evaluate our method by analysing known biologically relevant sets as well as random ones. The known relevant sets comprise macromolecular complexes, cellular components and pathways described for Saccharomyces cerevisiae, which are mostly significantly coherent. Finally, we illustrate the usefulness of our approach for validating 'functional modules' obtained from computational analysis of protein-protein interaction networks. Matlab code and supplementary data are available at http://www.cnb.csic.es/~monica/coherence/
url http://www.biomedcentral.com/1471-2105/9/444
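
The abstract describes scoring a protein set by the global similarity of its functional annotations and testing that score against a reference set. The record gives no formulas, so the following is only a minimal sketch of that general idea, not the authors' method or their Matlab code. It assumes binary annotation vectors, mean pairwise cosine similarity as the score, and an empirical p-value from random sets of equal size drawn from the reference. All names (`coherence_score`, `empirical_p_value`, the toy `annotations` table) are hypothetical.

```python
import math
import random
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length annotation vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def coherence_score(proteins, annotations):
    """Mean pairwise similarity of a set (an assumed stand-in for the score)."""
    pairs = list(combinations(proteins, 2))
    total = sum(cosine(annotations[a], annotations[b]) for a, b in pairs)
    return total / len(pairs)


def empirical_p_value(proteins, annotations, n_random=1000, seed=0):
    """Compare the observed score with scores of random same-size sets
    drawn from the reference set (here: every annotated protein)."""
    rng = random.Random(seed)
    reference = sorted(annotations)
    observed = coherence_score(proteins, annotations)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(reference, len(proteins))
        if coherence_score(sample, annotations) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_random + 1)


# Toy reference set: rows are proteins, columns are hypothetical annotation terms.
annotations = {
    "P1": [1, 1, 0, 0], "P2": [1, 1, 0, 0], "P3": [1, 1, 0, 0],
    "P4": [0, 0, 1, 1], "P5": [0, 0, 1, 1], "P6": [0, 0, 1, 1],
    "P7": [1, 0, 1, 0], "P8": [0, 1, 0, 1],
}

coherent_score, coherent_p = empirical_p_value(["P1", "P2", "P3"], annotations)
mixed_score, mixed_p = empirical_p_value(["P1", "P4", "P7"], annotations)
print(coherent_score, coherent_p)
print(mixed_score, mixed_p)
```

In this toy table the set with identical annotations scores higher and receives a smaller empirical p-value than the mixed set. A real analysis would build the annotation vectors from a curated resource and would likely need a more careful similarity measure and null model than the ones assumed here.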