The importance of recognizing and reporting sequence database contamination for proteomics
Advances in genome sequencing have made proteomic experiments more successful than ever. However, not all entries in a sequence database are of equal quality. Genome sequences are contaminated more frequently than is admitted. Contamination impacts homology-based proteomic, proteogenomic, and metapr...
Main Authors: | Olivier Pible, Erica M. Hartmann, Gilles Imbert, Jean Armengaud |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2014-06-01 |
Series: | EuPA Open Proteomics |
Subjects: | Database; Proteomics; Metaproteomics; Contamination; Blast analysis; Curation |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2212968514000269 |
id |
doaj-acd2171caf0542aaaf9f7ab5c7210ede |
---|---|
record_format |
Article |
spelling |
doaj-acd2171caf0542aaaf9f7ab5c7210ede; 2020-11-24T22:35:23Z; eng; Elsevier; EuPA Open Proteomics; 2212-9685; 2014-06-01; 3; C; 246-249; 10.1016/j.euprot.2014.04.001 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Olivier Pible; Erica M. Hartmann; Gilles Imbert; Jean Armengaud |
title |
The importance of recognizing and reporting sequence database contamination for proteomics |
publisher |
Elsevier |
series |
EuPA Open Proteomics |
issn |
2212-9685 |
publishDate |
2014-06-01 |
description |
Advances in genome sequencing have made proteomic experiments more successful than ever. However, not all entries in a sequence database are of equal quality. Genome sequences are contaminated more frequently than is admitted. Contamination impacts homology-based proteomic, proteogenomic, and metaproteomic results. We highlight two examples in the National Center for Biotechnology Information non-redundant database (NCBInr) that are likely contaminated: the bacterium Enterococcus gallinarum EGD-AAK12 and the insect Ceratitis capitata. We hope to incite users of this and other databases to critically evaluate submitted sequences and to contribute to the overall quality of the database by signaling potential errors when possible. |
topic |
Database; Proteomics; Metaproteomics; Contamination; Blast analysis; Curation |
url |
http://www.sciencedirect.com/science/article/pii/S2212968514000269 |