InChI version 1.06: now more than 99.99% reliable

Abstract The software for the IUPAC Chemical Identifier, InChI, is extraordinarily reliable. It has been tested on large databases around the world, and has proved itself to be an essential tool in the handling and integration of large chemical databases. InChI version 1.05 was released in January 2...


Bibliographic Details
Main Authors: Jonathan M. Goodman, Igor Pletnev, Paul Thiessen, Evan Bolton, Stephen R. Heller
Format: Article
Language: English
Published: BMC 2021-05-01
Series: Journal of Cheminformatics
Subjects: InChI, InChIKey, PubChem, RInChI
Online Access: https://doi.org/10.1186/s13321-021-00517-z
id doaj-05e85a7425aa491d9d31e08df2907068
record_format Article
spelling doaj-05e85a7425aa491d9d31e08df2907068 2021-05-30T11:44:24Z eng BMC Journal of Cheminformatics 1758-2946 2021-05-01 13 1 1 8 10.1186/s13321-021-00517-z
InChI version 1.06: now more than 99.99% reliable
Jonathan M. Goodman (Centre for Molecular Informatics, Yusuf Hamied Department of Chemistry)
Igor Pletnev (InChI Trust)
Paul Thiessen (National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health)
Evan Bolton (National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health)
Stephen R. Heller (InChI Trust)
https://doi.org/10.1186/s13321-021-00517-z
InChI
InChIKey
PubChem
RInChI
collection DOAJ
language English
format Article
sources DOAJ
author Jonathan M. Goodman
Igor Pletnev
Paul Thiessen
Evan Bolton
Stephen R. Heller
title InChI version 1.06: now more than 99.99% reliable
publisher BMC
series Journal of Cheminformatics
issn 1758-2946
publishDate 2021-05-01
description Abstract The software for the IUPAC Chemical Identifier, InChI, is extraordinarily reliable. It has been tested on large databases around the world, and has proved itself to be an essential tool in the handling and integration of large chemical databases. InChI version 1.05 was released in January 2017 and version 1.06 in December 2020. In this paper, we report on the current state of the InChI Software, the details of the improvements in the v1.06 release, and the results of a test of the InChI run on PubChem, a database of more than a hundred million molecules. The upgrade introduces significant new features, including support for pseudo-element atoms and an improved description of polymers. We expect that few, if any, applications using the standard InChI will need to change as a result of the changes in version 1.06. Numerical instability was discovered for 0.002% of this database, and a small number of other molecules were discovered for which the algorithm did not run smoothly. On the basis of PubChem data, we can demonstrate that InChI version 1.05 was 99.996% accurate, and InChI version 1.06 represents a step closer to perfection. Finally, we look forward to future releases and extensions for the InChI Chemical identifier.
topic InChI
InChIKey
PubChem
RInChI
url https://doi.org/10.1186/s13321-021-00517-z