Bayesian Methods Under Unknown Prior Distributions with Applications to The Analysis of Gene Expression Data

The local false discovery rate (LFDR) is one of many existing statistical methods for multiple hypothesis testing. As a Bayesian quantity, the LFDR is based on the prior probability of the null hypothesis and a mixture distribution of the null and non-null hypotheses. In practice, the LFDR is u...

Bibliographic Details
Main Author:	Rahal, Abbas
Other Authors:	Bickel, David
Format:	Others
Language:	en
Published:	Université d'Ottawa / University of Ottawa 2021
Subjects:	Robust Bayesian statistics; Imprecise probability; Bayesian model checking; Blended inference; Posterior predictive p-value; Local false discovery rate; Empirical Bayes; Multiple testing; Bayesian false discovery rate; Measure of evidence; Direct likelihood inference; Likelihoodism
Online Access:	http://hdl.handle.net/10393/42408 http://dx.doi.org/10.20381/ruor-26628

id	ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-42408
record_format	oai_dc
spelling	ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-42408 2021-07-16T05:22:36Z 2021-07-14T19:03:49Z 2021-07-14T19:03:49Z 2021-07-14 Thesis http://hdl.handle.net/10393/42408 http://dx.doi.org/10.20381/ruor-26628 en application/pdf Université d'Ottawa / University of Ottawa
collection	NDLTD
language	en
format	Others
sources	NDLTD
topic	Robust Bayesian statistics; Imprecise probability; Bayesian model checking; Blended inference; Posterior predictive p-value; Local false discovery rate; Empirical Bayes; Multiple testing; Bayesian false discovery rate; Measure of evidence; Direct likelihood inference; Likelihoodism
description	The local false discovery rate (LFDR) is one of many existing statistical methods for multiple hypothesis testing. As a Bayesian quantity, the LFDR is based on the prior probability of the null hypothesis and a mixture distribution of the null and non-null hypotheses. In practice, the LFDR is unknown and needs to be estimated. The empirical Bayes approach can be used to estimate that mixture distribution. Unlike hierarchical Bayes, empirical Bayes does not require complete information about the prior and hyperprior distributions. When there is not enough information at the prior level, empirical Bayes, instead of placing a distribution at the hyperprior level as in the hierarchical Bayes model, estimates the prior parameters from the data, often via the marginal distribution. In this research, we developed new Bayesian methods under unknown prior distributions. A set of adequate prior distributions may be defined using Bayesian model checking by setting a threshold on the posterior predictive p-value, prior predictive p-value, calibrated p-value, Bayes factor, or integrated likelihood. We derive a set of adequate posterior distributions from that set. To obtain a single posterior distribution instead of a set of adequate posterior distributions, we used a blended distribution, which minimizes the relative entropy of a set of adequate prior (or posterior) distributions to a "benchmark" prior (or posterior) distribution. We present two approaches to generating a blended posterior distribution, namely, updating-before-blending and blending-before-updating. The blended posterior distribution can be used to estimate the LFDR by considering the nonlocal false discovery rate as a benchmark and the different LFDR estimators as an adequate set. The likelihood ratio can often be misleading in multiple testing unless it is supplemented by adjusted p-values or by posterior probabilities based on sufficiently strong prior distributions. When the prior distributions are unknown, they can be estimated by empirical Bayes methods or blended distributions. We propose a general framework for applying the laws of likelihood to problems involving multiple hypotheses by bringing together multiple statistical models. We have applied the proposed framework to genomics, COVID-19, and other data sets.
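For readers unfamiliar with the LFDR, the following is a minimal sketch of its standard two-group definition: the prior null probability times the null density, divided by the mixture density. It is not the thesis's estimator. The normal null and alternative densities, the function names `normal_pdf` and `lfdr`, and the parameter values are all assumptions made for illustration; in practice the prior probability and the alternative density are unknown and have to be estimated, for example by empirical Bayes.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def lfdr(x, pi0, alt_mean, alt_sd=1.0):
    """Local false discovery rate under an assumed two-group normal mixture.

    pi0      -- prior probability of the null hypothesis (assumed known here)
    alt_mean -- mean of the assumed non-null density
    alt_sd   -- standard deviation of the assumed non-null density
    The null density is taken to be standard normal (an illustrative choice).
    """
    f0 = normal_pdf(x)                     # null density at x
    f1 = normal_pdf(x, alt_mean, alt_sd)   # non-null density at x
    mixture = pi0 * f0 + (1.0 - pi0) * f1  # marginal (mixture) density at x
    return pi0 * f0 / mixture              # posterior probability of the null


if __name__ == "__main__":
    # Hypothetical test statistics; larger values favor the non-null hypothesis.
    for x in (0.0, 2.0, 4.0):
        print(x, round(lfdr(x, pi0=0.9, alt_mean=3.0), 4))
```

A statistic near the null mean yields an LFDR close to 1, and a statistic far in the direction of the assumed alternative yields an LFDR close to 0.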
author2	Bickel, David
author_facet	Bickel, David Rahal, Abbas
author	Rahal, Abbas
author_sort	Rahal, Abbas
title	Bayesian Methods Under Unknown Prior Distributions with Applications to The Analysis of Gene Expression Data
title_sort	bayesian methods under unknown prior distributions with applications to the analysis of gene expression data
publisher	Université d'Ottawa / University of Ottawa
publishDate	2021
url	http://hdl.handle.net/10393/42408 http://dx.doi.org/10.20381/ruor-26628
work_keys_str_mv	AT rahalabbas bayesianmethodsunderunknownpriordistributionswithapplicationstotheanalysisofgeneexpressiondata
_version_	1719417169273421824

Similar Items