A Balanced Approach to Adaptive Probability Density Estimation
Our development of a Fast (Mutual) Information Matching (FIM) of molecular dynamics time series data led us to the general problem of how to accurately estimate the probability density function of a random variable, especially in cases of very uneven samples. Here, we propose a novel Balanced Adapti...
Main Authors: | Julio A. Kovacs, Cailee Helmick, Willy Wriggers |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2017-04-01 |
Series: | Frontiers in Molecular Biosciences |
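Subjects: | adaptive density estimation; covariance ellipsoid; covariance smoothing; optimal number of nearest neighbors; R*-tree; visual criterion |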
Online Access: | http://journal.frontiersin.org/article/10.3389/fmolb.2017.00025/full |
id |
doaj-2efcac64ff874b25bf184eaa838e1834 |
---|---|
record_format |
Article |
author |
Julio A. Kovacs, Cailee Helmick, Willy Wriggers |
title |
A Balanced Approach to Adaptive Probability Density Estimation |
publisher |
Frontiers Media S.A. |
series |
Frontiers in Molecular Biosciences |
issn |
2296-889X |
publishDate |
2017-04-01 |
description |
Our development of a Fast (Mutual) Information Matching (FIM) of molecular dynamics time series data led us to the general problem of how to accurately estimate the probability density function of a random variable, especially in cases of very uneven samples. Here, we propose a novel Balanced Adaptive Density Estimation (BADE) method that effectively optimizes the amount of smoothing at each point. To do this, BADE relies on an efficient nearest-neighbor search which results in good scaling for large data sizes. Our tests on simulated data show that BADE exhibits equal or better accuracy than existing methods, and visual tests on univariate and bivariate experimental data show that the results are also aesthetically pleasing. This is due in part to the use of a visual criterion for setting the smoothing level of the density estimate. Our results suggest that BADE offers an attractive new take on the fundamental density estimation problem in statistics. We have applied it on molecular dynamics simulations of membrane pore formation. We also expect BADE to be generally useful for low-dimensional applications in other statistical application domains such as bioinformatics, signal processing and econometrics. |
topic |
adaptive density estimation; covariance ellipsoid; covariance smoothing; optimal number of nearest neighbors; R*-tree; visual criterion |
url |
http://journal.frontiersin.org/article/10.3389/fmolb.2017.00025/full |
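
The description in this record says the method sets the amount of smoothing at each point and relies on a nearest-neighbor search. As a rough illustration of that general idea only, the sketch below implements a textbook one-dimensional sample-point adaptive kernel estimator in Python, where each sample gets a Gaussian kernel whose width is the distance to its k-th nearest neighbor. This is not the BADE algorithm: the function names `knn_bandwidths` and `adaptive_kde`, the fixed `k`, the Gaussian kernel, and the sorted-array neighbor search (standing in for the R*-tree named in the keywords) are all assumptions made for this example.

```python
import math
import random


def knn_bandwidths(samples, k):
    """Return the sorted samples and, for each one, the distance to its
    k-th nearest other sample (1-D only; hypothetical helper)."""
    xs = sorted(samples)
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(samples)")
    bandwidths = []
    for i, x in enumerate(xs):
        lo, hi = i - 1, i + 1  # next unvisited neighbors on each side
        dist = 0.0
        for _ in range(k):
            left = x - xs[lo] if lo >= 0 else math.inf
            right = xs[hi] - x if hi < n else math.inf
            # Step outward toward whichever side holds the closer neighbor.
            if left <= right:
                dist = left
                lo -= 1
            else:
                dist = right
                hi += 1
        # Floor the width so duplicated samples do not divide by zero.
        bandwidths.append(max(dist, 1e-12))
    return xs, bandwidths


def adaptive_kde(samples, k):
    """Build a density function with one Gaussian kernel per sample,
    each kernel as wide as that sample's k-th-neighbor distance."""
    xs, hs = knn_bandwidths(samples, k)
    norm = 1.0 / (len(xs) * math.sqrt(2.0 * math.pi))

    def density(x):
        total = 0.0
        for xi, hi in zip(xs, hs):
            z = (x - xi) / hi
            total += math.exp(-0.5 * z * z) / hi
        return norm * total

    return density


# Uneven sample: a tight cluster near 0 plus a sparse, wide cluster near 6.
rng = random.Random(0)
data = [rng.gauss(0.0, 0.5) for _ in range(150)]
data += [rng.gauss(6.0, 2.0) for _ in range(50)]
f = adaptive_kde(data, k=15)
print("density near the tight cluster:", f(0.0))
print("density in the sparse region:  ", f(3.0))
```

Because the kernel width follows the local spacing of the samples, dense regions are smoothed lightly and sparse regions heavily, which is the behavior adaptive estimators aim for on very uneven samples. Choosing `k` automatically, as the record's "optimal number of nearest neighbors" keyword suggests the paper does, is outside the scope of this sketch.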