A Balanced Approach to Adaptive Probability Density Estimation

Our development of a Fast (Mutual) Information Matching (FIM) of molecular dynamics time series data led us to the general problem of how to accurately estimate the probability density function of a random variable, especially in cases of very uneven samples. Here, we propose a novel Balanced Adapti...
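
For orientation, the general idea of adaptive density estimation (more smoothing where samples are sparse, less where they are dense) can be illustrated with a generic sample-point estimator whose per-sample bandwidth is the distance to the k-th nearest neighbor. This is a minimal sketch under stated assumptions, not the BADE method of the article: the function name `knn_adaptive_kde`, the Gaussian kernel, the fixed `k`, and the brute-force neighbor search are all illustrative choices, and neither the R*-tree search nor the visual criterion listed in the record's keywords is reproduced.

```python
import numpy as np


def knn_adaptive_kde(samples, grid, k=10):
    """Generic 1D sample-point adaptive Gaussian KDE (illustrative only).

    Each sample gets its own bandwidth, set to the distance to its k-th
    nearest neighbor, so sparse regions are smoothed more than dense ones.
    """
    x = np.asarray(samples, dtype=float)
    g = np.asarray(grid, dtype=float)
    if not 1 <= k < x.size:
        raise ValueError("k must satisfy 1 <= k < number of samples")

    # Brute-force pairwise distances (O(n^2) memory); a spatial index
    # would be preferable for large data sets.
    dist = np.abs(x[:, None] - x[None, :])
    # After sorting each row, column 0 is the zero self-distance and
    # column k is the distance to the k-th nearest neighbor.
    h = np.sort(dist, axis=1)[:, k]
    h = np.maximum(h, 1e-12)  # guard against duplicated sample values

    # One normalized Gaussian per sample, evaluated on the grid, then averaged.
    z = (g[:, None] - x[None, :]) / h[None, :]
    kernels = np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * h[None, :])
    return kernels.mean(axis=1)


# Hypothetical uneven sample: a tight cluster plus a broad one.
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(0.0, 0.3, 100), rng.normal(5.0, 2.0, 100)])
grid = np.linspace(-20.0, 25.0, 9001)
density = knn_adaptive_kde(samples, grid, k=5)
```

Because every kernel is individually normalized, the averaged estimate integrates to one regardless of how the bandwidths vary. The choice of `k` controls the overall smoothing level and is left as a free parameter here.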

Bibliographic Details
Main Authors: Julio A. Kovacs, Cailee Helmick, Willy Wriggers
Format: Article
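Language: English
Published: Frontiers Media S.A. 2017-04-01
Series: Frontiers in Molecular Biosciences
Subjects: adaptive density estimation; covariance ellipsoid; covariance smoothing; optimal number of nearest neighbors; R*-tree; visual criterion
Online Access: http://journal.frontiersin.org/article/10.3389/fmolb.2017.00025/full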
id doaj-2efcac64ff874b25bf184eaa838e1834
record_format Article
spelling doaj-2efcac64ff874b25bf184eaa838e1834 | 2020-11-24T21:33:14Z | eng | Frontiers Media S.A. | Frontiers in Molecular Biosciences | 2296-889X | 2017-04-01 | 4 | 10.3389/fmolb.2017.00025 | 246007 | A Balanced Approach to Adaptive Probability Density Estimation | Julio A. Kovacs | Cailee Helmick | Willy Wriggers | Our development of a Fast (Mutual) Information Matching (FIM) of molecular dynamics time series data led us to the general problem of how to accurately estimate the probability density function of a random variable, especially in cases of very uneven samples. Here, we propose a novel Balanced Adaptive Density Estimation (BADE) method that effectively optimizes the amount of smoothing at each point. To do this, BADE relies on an efficient nearest-neighbor search which results in good scaling for large data sizes. Our tests on simulated data show that BADE exhibits equal or better accuracy than existing methods, and visual tests on univariate and bivariate experimental data show that the results are also aesthetically pleasing. This is due in part to the use of a visual criterion for setting the smoothing level of the density estimate. Our results suggest that BADE offers an attractive new take on the fundamental density estimation problem in statistics. We have applied it on molecular dynamics simulations of membrane pore formation. We also expect BADE to be generally useful for low-dimensional applications in other statistical application domains such as bioinformatics, signal processing and econometrics. | http://journal.frontiersin.org/article/10.3389/fmolb.2017.00025/full | adaptive density estimation | covariance ellipsoid | covariance smoothing | optimal number of nearest neighbors | R*-tree | visual criterion
collection DOAJ
language English
format Article
sources DOAJ
author Julio A. Kovacs
Cailee Helmick
Willy Wriggers
title A Balanced Approach to Adaptive Probability Density Estimation
publisher Frontiers Media S.A.
series Frontiers in Molecular Biosciences
issn 2296-889X
publishDate 2017-04-01
description Our development of Fast (Mutual) Information Matching (FIM) of molecular dynamics time series data led us to the general problem of how to accurately estimate the probability density function of a random variable, especially in cases of very uneven samples. Here, we propose a novel Balanced Adaptive Density Estimation (BADE) method that effectively optimizes the amount of smoothing at each point. To do this, BADE relies on an efficient nearest-neighbor search, which results in good scaling for large data sizes. Our tests on simulated data show that BADE achieves accuracy equal to or better than that of existing methods, and visual tests on univariate and bivariate experimental data show that the results are also aesthetically pleasing. This is due in part to the use of a visual criterion for setting the smoothing level of the density estimate. Our results suggest that BADE offers an attractive new take on the fundamental density estimation problem in statistics. We have applied it to molecular dynamics simulations of membrane pore formation. We also expect BADE to be generally useful for low-dimensional applications in other statistical application domains such as bioinformatics, signal processing, and econometrics.
topic adaptive density estimation
covariance ellipsoid
covariance smoothing
optimal number of nearest neighbors
R*-tree
visual criterion
url http://journal.frontiersin.org/article/10.3389/fmolb.2017.00025/full