GIFtS: annotation landscape analysis with GeneCards

Abstract. Background: Gene annotation is a pivotal component in computational genomics, encompassing prediction of gene function, expression analysis, and sequence scrutiny. Hence, quantitative measures of the annotation landscape constitute a pertinent b...


Bibliographic Details
Main Authors: Dalah Irina, Strichman-Almashanu Liora, Stelzer Gil, Inger Aron, Harel Arye, Safran Marilyn, Lancet Doron
Format: Article
Language: English
Published: BMC 2009-10-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/10/348
id doaj-9eeae00ae85b43ce9a73b034e3b80d41
record_format Article
spelling doaj-9eeae00ae85b43ce9a73b034e3b80d41
2020-11-25T01:04:43Z
eng
BMC
BMC Bioinformatics
1471-2105
2009-10-01
10
1
348
10.1186/1471-2105-10-348
GIFtS: annotation landscape analysis with GeneCards
Dalah Irina
Strichman-Almashanu Liora
Stelzer Gil
Inger Aron
Harel Arye
Safran Marilyn
Lancet Doron
http://www.biomedcentral.com/1471-2105/10/348
collection DOAJ
language English
format Article
sources DOAJ
author Dalah Irina
Strichman-Almashanu Liora
Stelzer Gil
Inger Aron
Harel Arye
Safran Marilyn
Lancet Doron
title GIFtS: annotation landscape analysis with GeneCards
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2009-10-01
description Abstract
Background: Gene annotation is a pivotal component in computational genomics, encompassing prediction of gene function, expression analysis, and sequence scrutiny. Hence, quantitative measures of the annotation landscape constitute a pertinent bioinformatics tool. GeneCards® is a gene-centric compendium of rich annotative information for over 50,000 human gene entries, building upon 68 data sources, including Gene Ontology (GO), pathways, interactions, phenotypes, publications and many more.
Results: We present the GeneCards Inferred Functionality Score (GIFtS), which allows a quantitative assessment of a gene's annotation status by exploiting the unique wealth and diversity of GeneCards information. The GIFtS tool, linked from the GeneCards home page, facilitates browsing the human genome by searching for the annotation level of a specified gene, retrieving a list of genes within a specified range of GIFtS values, obtaining random genes with a specific GIFtS value, and experimenting with the GIFtS weighting algorithm for a variety of annotation categories. The bimodal shape of the GIFtS distribution suggests a division of the human gene repertoire into two main groups: the high-GIFtS peak consists almost entirely of protein-coding genes; the low-GIFtS peak consists of genes from all of the categories. Cluster analysis of GIFtS annotation vectors provides the classification of gene groups by detailed positioning in the annotation arena. GIFtS also provides measures that enable the evaluation of the databases that serve as GeneCards sources. An inverse correlation is found (for GIFtS > 25) between the number of genes annotated by each source and the average GIFtS value of genes associated with that source. Three typical source prototypes are revealed by their GIFtS distribution: genome-wide sources, sources comprising mainly highly annotated genes, and sources comprising mainly poorly annotated genes. The degree of accumulated knowledge for a given gene, as measured by GIFtS, was correlated (for GIFtS > 30) with the number of publications for that gene and with the seniority of its entry in the HGNC database.
Conclusion: GIFtS can be a valuable tool for computational procedures that analyze lists of large sets of genes resulting from wet-lab or computational research. GIFtS may also assist the scientific community with identification of groups of uncharacterized genes for diverse applications, such as delineation of novel functions and charting unexplored areas of the human genome.
url http://www.biomedcentral.com/1471-2105/10/348