Validation of experimental charge-density refinement strategies: when do we overfit?

A cross-validation method is supplied to judge between various strategies in multipole refinement procedures. Its application enables straightforward detection of whether the refinement of additional parameters leads to an improvement in the model or an overfitting of the given data. For all tested...


Bibliographic Details
Main Authors: Lennard Krause, Benedikt Niepötter, Christian J. Schürmann, Dietmar Stalke, Regine Herbst-Irmer
Format: Article
Language: English
Published: International Union of Crystallography 2017-07-01
Series: IUCrJ
Subjects: charge density; cross-validation; error detection; Rfree; refinement strategies
Online Access: http://scripts.iucr.org/cgi-bin/paper?S2052252517005103
id doaj-5deb95765a134fc98e33bc6e4a7a40ec
record_format Article
spelling doaj-5deb95765a134fc98e33bc6e4a7a40ec; 2020-11-24T23:41:40Z; eng; International Union of Crystallography; IUCrJ; ISSN 2052-2525; 2017-07-01; vol. 4, no. 4, pp. 420-430; doi:10.1107/S2052252517005103; lc5072; Validation of experimental charge-density refinement strategies: when do we overfit?; Lennard Krause, Benedikt Niepötter, Christian J. Schürmann, Dietmar Stalke, Regine Herbst-Irmer (all: Institut für Anorganische Chemie, Universität Göttingen, Tammannstraße 4, Göttingen 37077, Germany); http://scripts.iucr.org/cgi-bin/paper?S2052252517005103; charge density; cross-validation; error detection; Rfree; refinement strategies
collection DOAJ
language English
format Article
sources DOAJ
author Lennard Krause
Benedikt Niepötter
Christian J. Schürmann
Dietmar Stalke
Regine Herbst-Irmer
title Validation of experimental charge-density refinement strategies: when do we overfit?
publisher International Union of Crystallography
series IUCrJ
issn 2052-2525
publishDate 2017-07-01
description A cross-validation method is supplied to judge between various strategies in multipole refinement procedures. Its application enables straightforward detection of whether the refinement of additional parameters leads to an improvement in the model or an overfitting of the given data. For all tested data sets it was possible to prove that the multipole parameters of atoms in comparable chemical environments should be constrained to be identical. In an automated approach, this method additionally delivers parameter distributions of k different refinements. These distributions can be used for further error diagnostics, e.g. to detect erroneously defined parameters or incorrectly determined reflections. Visualization tools show the variation in the parameters. These different refinements also provide rough estimates for the standard deviation of topological parameters.
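The k-fold idea in the description can be illustrated with a minimal sketch. This is not the authors' code: a toy polynomial least-squares fit stands in for a multipole refinement, and all names (`kfold_indices`, `r_factor`, `fit_poly`, `cross_validate`) are hypothetical. Each of the k runs leaves out one disjoint "free" set, refines against the rest, and reports the R value on both sets. A model whose extra parameters lower the working R but not the free R is overfitting. The k parameter vectors also give a rough spread for each parameter.

```python
import random
import statistics

def kfold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k disjoint free sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def r_factor(f_obs, f_calc):
    """Conventional R value: sum|Fo - Fc| / sum|Fo|."""
    num = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    return num / sum(abs(o) for o in f_obs)

def fit_poly(xs, ys, degree):
    """Toy 'refinement': least-squares polynomial via normal equations."""
    m = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, m):
            f = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution
    coef = [0.0] * m
    for i in reversed(range(m)):
        s = sum(a[i][j] * coef[j] for j in range(i + 1, m))
        coef[i] = (b[i] - s) / a[i][i]
    return coef

def predict(coef, x):
    return sum(c * x ** i for i, c in enumerate(coef))

def cross_validate(xs, ys, degree, k=5, seed=0):
    """Return mean R_work, mean R_free and the k refined parameter sets."""
    r_work, r_free, params = [], [], []
    for free in kfold_indices(len(xs), k, seed):
        free_set = set(free)
        work = [i for i in range(len(xs)) if i not in free_set]
        coef = fit_poly([xs[i] for i in work], [ys[i] for i in work], degree)
        r_work.append(r_factor([ys[i] for i in work],
                               [predict(coef, xs[i]) for i in work]))
        r_free.append(r_factor([ys[i] for i in free],
                               [predict(coef, xs[i]) for i in free]))
        params.append(coef)
    return sum(r_work) / k, sum(r_free) / k, params

if __name__ == "__main__":
    # Synthetic data (assumption): a straight line plus Gaussian noise.
    rng = random.Random(1)
    xs = [i / 39 for i in range(40)]
    ys = [2.0 + 1.5 * x + rng.gauss(0.0, 0.05) for x in xs]
    for degree in (1, 3, 6):
        rw, rf, params = cross_validate(xs, ys, degree, k=5)
        spread = [statistics.stdev(p[j] for p in params)
                  for j in range(degree + 1)]
        print(degree, round(rw, 4), round(rf, 4),
              [round(s, 3) for s in spread])
```

Comparing the printed working and free R values across degrees mimics the comparison of refinement strategies. The per-parameter standard deviations over the k runs correspond to the parameter distributions mentioned in the description.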
topic charge density
cross-validation
error detection
Rfree
refinement strategies
url http://scripts.iucr.org/cgi-bin/paper?S2052252517005103