A hybrid parameter estimation algorithm for beta mixtures and applications to methylation state classification
Abstract. Background: Mixtures of beta distributions are a flexible tool for modeling data with values on the unit interval, such as methylation levels. However, maximum likelihood parameter estimation with beta distributions suffers from problems because of singularities in the log-likelihood functio...
Main Authors: | Christopher Schröder, Sven Rahmann |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2017-08-01 |
Series: | Algorithms for Molecular Biology |
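Subjects: | Mixture model; Beta distribution; Maximum likelihood; Method of moments; EM algorithm; Differential methylation |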
Online Access: | http://link.springer.com/article/10.1186/s13015-017-0112-1 |
id |
doaj-82da8ea0d5a64a62abaaf450e61e11a0 |
---|---|
record_format |
Article |
spelling |
doaj-82da8ea0d5a64a62abaaf450e61e11a0; 2020-11-24T21:01:23Z; eng; BMC; Algorithms for Molecular Biology; 1748-7188; 2017-08-01; 121112; 10.1186/s13015-017-0112-1; A hybrid parameter estimation algorithm for beta mixtures and applications to methylation state classification; Christopher Schröder [0]; Sven Rahmann [1]; [0] Genome Informatics, Institute of Human Genetics, University of Duisburg-Essen, University Hospital Essen; [1] Genome Informatics, Institute of Human Genetics, University of Duisburg-Essen, University Hospital Essen; Abstract. Background: Mixtures of beta distributions are a flexible tool for modeling data with values on the unit interval, such as methylation levels. However, maximum likelihood parameter estimation with beta distributions suffers from problems because of singularities in the log-likelihood function if some observations take the values 0 or 1. Methods: While ad-hoc corrections have been proposed to mitigate this problem, we propose a different approach to parameter estimation for beta mixtures where such problems do not arise in the first place. Our algorithm combines latent variables with the method of moments instead of maximum likelihood, which has computational advantages over the popular EM algorithm. Results: As an application, we demonstrate that methylation state classification is more accurate when using adaptive thresholds from beta mixtures than non-adaptive thresholds on observed methylation levels. We also demonstrate that we can accurately infer the number of mixture components. Conclusions: The hybrid algorithm between likelihood-based component un-mixing and moment-based parameter estimation is a robust and efficient method for beta mixture estimation. We provide an implementation of the method (“betamix”) as open source software under the MIT license.; http://link.springer.com/article/10.1186/s13015-017-0112-1; Mixture model; Beta distribution; Maximum likelihood; Method of moments; EM algorithm; Differential methylation |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Christopher Schröder; Sven Rahmann |
title |
A hybrid parameter estimation algorithm for beta mixtures and applications to methylation state classification |
publisher |
BMC |
series |
Algorithms for Molecular Biology |
issn |
1748-7188 |
publishDate |
2017-08-01 |
description |
Abstract. Background: Mixtures of beta distributions are a flexible tool for modeling data with values on the unit interval, such as methylation levels. However, maximum likelihood parameter estimation with beta distributions suffers from problems because of singularities in the log-likelihood function if some observations take the values 0 or 1. Methods: While ad-hoc corrections have been proposed to mitigate this problem, we propose a different approach to parameter estimation for beta mixtures where such problems do not arise in the first place. Our algorithm combines latent variables with the method of moments instead of maximum likelihood, which has computational advantages over the popular EM algorithm. Results: As an application, we demonstrate that methylation state classification is more accurate when using adaptive thresholds from beta mixtures than non-adaptive thresholds on observed methylation levels. We also demonstrate that we can accurately infer the number of mixture components. Conclusions: The hybrid algorithm between likelihood-based component un-mixing and moment-based parameter estimation is a robust and efficient method for beta mixture estimation. We provide an implementation of the method (“betamix”) as open source software under the MIT license. |
topic |
Mixture model; Beta distribution; Maximum likelihood; Method of moments; EM algorithm; Differential methylation |
url |
http://link.springer.com/article/10.1186/s13015-017-0112-1 |
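
The abstract describes moment-based parameter estimation as the step that avoids the log-likelihood singularities at 0 and 1. The following is a minimal sketch of that idea for a single component, not the authors' "betamix" implementation: it matches the weighted sample mean and variance to those of a beta distribution, using the standard identities mean = α/(α+β) and variance = αβ/((α+β)²(α+β+1)). The function name `beta_from_moments` and its `weights` parameter (standing in for per-observation component weights from an un-mixing step, which is not shown) are assumptions made for illustration.

```python
def beta_from_moments(xs, weights=None):
    """Estimate (alpha, beta) by matching the weighted mean and variance.

    Hypothetical helper for illustration; `weights` stands in for the
    per-observation weights an un-mixing step would supply.
    """
    if weights is None:
        weights = [1.0] * len(xs)
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, xs)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, xs)) / total
    # A beta distribution requires 0 < var < mean * (1 - mean).
    if var <= 0.0 or var >= mean * (1.0 - mean):
        raise ValueError("moments are not attainable by a beta distribution")
    phi = mean * (1.0 - mean) / var - 1.0  # phi equals alpha + beta
    return mean * phi, (1.0 - mean) * phi


# Observations equal to 0 (or 1) need no special handling here,
# because only the first two moments of the data are used.
alpha, beta = beta_from_moments([0.0, 0.1, 0.2, 0.3, 0.4])
```

Because no logarithm of an observation is ever taken, values exactly at 0 or 1 enter the estimate like any other value, which is the property the abstract highlights.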