The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching
Abstract. Background: The Chemistry Development Kit (CDK) is a widely used open source cheminformatics toolkit, providing data structures to represent chemical concepts along with methods to manipulate such structures and perform computations on them. The library implements a wide variety of cheminfor...
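Main Authors: | Egon L. Willighagen, John W. Mayfield, Jonathan Alvarsson, Arvid Berg, Lars Carlsson, Nina Jeliazkova, Stefan Kuhn, Tomáš Pluskal, Miquel Rojas-Chertó, Ola Spjuth, Gilleain Torrance, Chris T. Evelo, Rajarshi Guha, Christoph Steinbeck |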
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2017-06-01 |
Series: | Journal of Cheminformatics |
Subjects: | Java, Cheminformatics, Bioinformatics, Metabolomics, Depiction |
Online Access: | http://link.springer.com/article/10.1186/s13321-017-0220-4 |
id |
doaj-04d9c1e269724925b7979d21a9dd3f92 |
---|---|
record_format |
Article |
spelling |
doaj-04d9c1e269724925b7979d21a9dd3f92; 2020-11-25T00:42:45Z; eng; BMC; Journal of Cheminformatics; 1758-2946; 2017-06-01; 9; 1; 1; 19; 10.1186/s13321-017-0220-4; The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching. Authors: Egon L. Willighagen (0), John W. Mayfield (1), Jonathan Alvarsson (2), Arvid Berg (3), Lars Carlsson (4), Nina Jeliazkova (5), Stefan Kuhn (6), Tomáš Pluskal (7), Miquel Rojas-Chertó (8), Ola Spjuth (9), Gilleain Torrance, Chris T. Evelo (10), Rajarshi Guha (11), Christoph Steinbeck (12). Affiliations: (0) Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University; (1) NextMove Software Ltd; (2) Department of Pharmaceutical Biosciences, Uppsala University; (3) Department of Pharmaceutical Biosciences, Uppsala University; (4) AstraZeneca, Innovative Medicines & Early Development, Quantitative Biology; (5) Ideaconsult Ltd; (6) Department of Informatics, University of Leicester; (7) Whitehead Institute for Biomedical Research; (8) Química Clínica Aplicada; (9) Department of Pharmaceutical Biosciences, Uppsala University; (10) Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University; (11) National Center for Advancing Translational Sciences; (12) Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Egon L. Willighagen, John W. Mayfield, Jonathan Alvarsson, Arvid Berg, Lars Carlsson, Nina Jeliazkova, Stefan Kuhn, Tomáš Pluskal, Miquel Rojas-Chertó, Ola Spjuth, Gilleain Torrance, Chris T. Evelo, Rajarshi Guha, Christoph Steinbeck |
title |
The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching |
publisher |
BMC |
series |
Journal of Cheminformatics |
issn |
1758-2946 |
publishDate |
2017-06-01 |
description |
Abstract. Background: The Chemistry Development Kit (CDK) is a widely used open source cheminformatics toolkit, providing data structures to represent chemical concepts along with methods to manipulate such structures and perform computations on them. The library implements a wide variety of cheminformatics algorithms ranging from chemical structure canonicalization to molecular descriptor calculations and pharmacophore perception. It is used in drug discovery, metabolomics, and toxicology. Over the last 10 years, however, the code base has grown significantly, resulting in many complex interdependencies among components and poor performance of many algorithms. Results: We report improvements to the CDK v2.0 since the v1.2 release series, specifically addressing the increased functional complexity and poor performance. We first summarize the addition of new functionality, such as atom typing and molecular formula handling, and improvements to existing functionality that have led to significantly better performance for substructure searching, molecular fingerprints, and rendering of molecules. Second, we outline how the CDK has evolved with respect to quality control and the approaches we have adopted to ensure stability, including a code review mechanism. Conclusions: This paper highlights our continued efforts to provide a community-driven, open source cheminformatics library, and shows that such collaborative projects can thrive over extended periods of time, resulting in a high-quality and performant library. By taking advantage of community support and contributions, we show that an open source cheminformatics project can act as a peer-reviewed publishing platform for scientific computing software. Graphical abstract: CDK 2.0 provides new features and improved performance. |
topic |
Java, Cheminformatics, Bioinformatics, Metabolomics, Depiction |
url |
http://link.springer.com/article/10.1186/s13321-017-0220-4 |