RNRdb, a curated database of the universal enzyme family ribonucleotide reductase, reveals a high level of misannotation in sequences deposited to Genbank

Abstract. Background: Ribonucleotide reductases (RNRs) catalyse the only known de novo pathway for deoxyribonucleotide synthesis, and are therefore essential to DNA-based life. While ribonucleotide reduction has a single evolutionary origin, significant d...


Bibliographic Details
Main Authors: Poole Anthony M, Torrents Eduard, Lundin Daniel, Sjöberg Britt-Marie
Format: Article
Language: English
Published: BMC 2009-12-01
Series: BMC Genomics
Online Access: http://www.biomedcentral.com/1471-2164/10/589
id doaj-cb17bf370a4849209741f14ecfe1fe29
record_format Article
spelling doaj-cb17bf370a4849209741f14ecfe1fe29 2020-11-25T00:37:53Z eng BMC BMC Genomics 1471-2164 2009-12-01 10 1 589 10.1186/1471-2164-10-589 RNRdb, a curated database of the universal enzyme family ribonucleotide reductase, reveals a high level of misannotation in sequences deposited to Genbank Poole Anthony M Torrents Eduard Lundin Daniel Sjöberg Britt-Marie
collection DOAJ
language English
format Article
sources DOAJ
author Poole Anthony M
Torrents Eduard
Lundin Daniel
Sjöberg Britt-Marie
author_sort Poole Anthony M
title RNRdb, a curated database of the universal enzyme family ribonucleotide reductase, reveals a high level of misannotation in sequences deposited to Genbank
title_sort rnrdb, a curated database of the universal enzyme family ribonucleotide reductase, reveals a high level of misannotation in sequences deposited to genbank
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2009-12-01
description Abstract
Background: Ribonucleotide reductases (RNRs) catalyse the only known de novo pathway for deoxyribonucleotide synthesis, and are therefore essential to DNA-based life. While ribonucleotide reduction has a single evolutionary origin, significant differences between RNRs nevertheless exist, notably in cofactor requirements, subunit composition and allosteric regulation. These differences result in distinct operational constraints (anaerobicity, iron/oxygen dependence and cobalamin dependence), and form the basis for the classification of RNRs into three classes.
Description: In RNRdb (Ribonucleotide Reductase database), we have collated and curated all known RNR protein sequences with the aim of providing a resource for exploration of RNR diversity and distribution. By comparing expert manual annotations with annotations stored in Genbank, we find that significant inaccuracies exist in larger databases. To our surprise, only 23% of protein sequences included in RNRdb are correctly annotated across the key attributes of class, role and function, with 17% being incorrectly annotated across all three categories. This illustrates the utility of specialist databases for applications where a high degree of annotation accuracy may be important. The database houses information on annotation, distribution and diversity of RNRs, and links to solved RNR structures, and can be searched through a BLAST interface. RNRdb is accessible through a public web interface at http://rnrdb.molbio.su.se.
Conclusion: RNRdb is a specialist database that provides a reliable annotation and classification resource for RNR proteins, as well as a tool to explore distribution patterns of RNR classes. The recent expansion in available genome sequence data has provided us with a picture of RNR distribution that is more complex than believed only a few years ago; our database indicates that RNRs of all three classes are found across all three cellular domains. Moreover, we find a number of organisms that encode all three classes.
url http://www.biomedcentral.com/1471-2164/10/589
_version_ 1725299194012368896