On Comparison of SimTandem with State-of-the-Art Peptide Identification Tools, Efficiency of Precursor Mass Filter and Dealing with Variable Modifications

The similarity search in theoretical mass spectra generated from protein sequence databases is a widely accepted approach for identification of peptides from query mass spectra produced by shotgun proteomics. Growing protein sequence databases and noisy query spectra demand database indexing techniq...

Bibliographic Details
Main Authors:	Novák Jiří, Hoksza David, Skopal Tomáš, Kohlbacher Oliver
Format:	Article
Language:	English
Published:	De Gruyter 2013-12-01
Series:	Journal of Integrative Bioinformatics
Online Access:	https://doi.org/10.1515/jib-2013-228

id	doaj-149e0c1c8dcd44b0832b03553dd5cd07
record_format	Article
collection	DOAJ
issn	1613-4516
affiliations	Novák Jiří, Hoksza David, Skopal Tomáš: Charles University in Prague, Faculty of Mathematics and Physics, SIRET Research Group, Malostranské nám. 25, 118 00 Prague, Czech Republic, http://www.siret.cz
	Kohlbacher Oliver: Eberhard-Karls-Universität Tübingen, Applied Bioinformatics Group, Sand 14, 72076 Tübingen, Germany, http://abi.inf.uni-tuebingen.de
description	The similarity search in theoretical mass spectra generated from protein sequence databases is a widely accepted approach for identification of peptides from query mass spectra produced by shotgun proteomics. Growing protein sequence databases and noisy query spectra demand database indexing techniques and better similarity measures for the comparison of theoretical spectra against query spectra. We employ a modification of previously proposed parameterized Hausdorff distance for comparisons of mass spectra. The new distance outperforms the original distance, the angle distance and state-of-the-art peptide identification tools OMSSA and X!Tandem in the number of identified peptides even though the q-value is only 0.001. When a precursor mass filter is used as a database indexing technique, our method outperforms OMSSA in the speed of search. When variable modifications are not searched, the search time is similar to X!Tandem. We show that the precursor mass filter is an efficient database indexing technique for high-accuracy data even though many variable modifications are being searched. We demonstrate that the number of identified peptides is bigger when variable modifications are searched separately by more search runs of a peptide identification engine. Otherwise, the false discovery rates are affected by mixing unmodified and modified spectra together resulting in a lower number of identified peptides. Our method is implemented in the freely available application SimTandem which can be used in the framework TOPP based on OpenMS.

Similar Items