Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments

Bibliographic Details
Main Authors: Byrne Kevin P, Armisén David, ÓhÉigeartaigh Seán S, Wolfe Kenneth H
Format: Article
Language: English
Published: BMC 2011-07-01
Series: BMC Genomics
Online Access: http://www.biomedcentral.com/1471-2164/12/377
id doaj-b2ac30d1323a40caad64a7f3a6d7c038
record_format Article
doi 10.1186/1471-2164-12-377
collection DOAJ
sources DOAJ
issn 1471-2164
description Abstract

Background: In standard BLAST searches, no information other than the sequences of the query and the database entries is considered. However, in situations where two genes from different species have only borderline similarity in a BLAST search, the discovery that the genes are located within a region of conserved gene order (synteny) can provide additional evidence that they are orthologs. Thus, for interpreting borderline search results, it would be useful to know whether the syntenic context of a database hit is similar to that of the query. This principle has often been used in investigations of particular genes or genomic regions, but to our knowledge it has never been implemented systematically.

Results: We made use of the synteny information contained in the Yeast Gene Order Browser database for 11 yeast species to carry out a systematic search for protein-coding genes that were overlooked in the original annotations of one or more yeast genomes but which are syntenic with their orthologs. Such genes tend to have been overlooked because they are short, highly divergent, or contain introns. The key features of our software - called SearchDOGS - are that the database entries are classified into sets of genomic segments that are already known to be orthologous, and that very weak BLAST hits are retained for further analysis if their genomic location is similar to that of the query. Using SearchDOGS we identified 595 additional protein-coding genes among the 11 yeast species, including two new genes in Saccharomyces cerevisiae. We found additional genes for the mating pheromone a-factor in six species including Kluyveromyces lactis.

Conclusions: SearchDOGS has proven highly successful for identifying overlooked genes in the yeast genomes. We anticipate that our approach can be adapted for study of further groups of species, such as bacterial genomes. More generally, the concept of doing sequence similarity searches against databases to which external information has been added may prove useful in other settings.
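
The principle stated in the abstract, that a very weak BLAST hit is retained when its genomic location matches that of the query, can be illustrated with a minimal sketch. This is not the SearchDOGS implementation: the `Hit` record, the `orthologous_segments` mapping, and both E-value cutoffs are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLAST hit against a genomic segment (all fields are hypothetical)."""
    query_gene: str
    segment_id: str  # identifier of the genomic segment containing the hit
    evalue: float


# Assumed cutoffs, not taken from the article.
STRONG_EVALUE = 1e-5  # a hit this good is kept regardless of location
WEAK_EVALUE = 10.0    # permissive cutoff applied only to syntenic hits


def keep_hit(hit, orthologous_segments):
    """Return True if the hit should be retained for further analysis.

    orthologous_segments maps each query gene to the set of segment ids
    already known to be orthologous to the query's own genomic location.
    """
    if hit.evalue <= STRONG_EVALUE:
        return True
    syntenic = hit.segment_id in orthologous_segments.get(hit.query_gene, set())
    return syntenic and hit.evalue <= WEAK_EVALUE


def filter_hits(hits, orthologous_segments):
    """Keep strong hits anywhere, and weak hits only in orthologous segments."""
    return [h for h in hits if keep_hit(h, orthologous_segments)]
```

In this sketch a borderline hit survives only when it falls in a segment already classified as orthologous to the query's location, which is where the added synteny information does its work; a real pipeline would still need to examine each retained hit further before calling it a gene.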