Bayesian Methods Under Unknown Prior Distributions with Applications to The Analysis of Gene Expression Data

The local false discovery rate (LFDR) is one of many existing statistical methods for multiple hypothesis testing. As a Bayesian quantity, the LFDR is based on the prior probability of the null hypothesis and a mixture distribution of the null and non-null hypotheses. In practice, the LFDR is u...

Bibliographic Details
Main Author: Rahal, Abbas
Other Authors: Bickel, David
Format: Others
Language: en
Published: Université d'Ottawa / University of Ottawa 2021
Subjects:
Online Access: http://hdl.handle.net/10393/42408
http://dx.doi.org/10.20381/ruor-26628
id ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-42408
record_format oai_dc
spelling ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-42408 2021-07-16T05:22:36Z 2021-07-14T19:03:49Z 2021-07-14T19:03:49Z 2021-07-14 Thesis application/pdf
collection NDLTD
language en
format Others
sources NDLTD
topic Robust Bayesian statistics
Imprecise probability
Bayesian model checking
Blended inference
Posterior predictive p-value
Local false discovery rate
Empirical Bayes
Multiple testing
Bayesian false discovery rate
Measure of evidence
Direct likelihood inference
Likelihoodism
description The local false discovery rate (LFDR) is one of many existing statistical methods for multiple hypothesis testing. As a Bayesian quantity, the LFDR is based on the prior probability of the null hypothesis and a mixture distribution of the null and non-null hypotheses. In practice, the LFDR is unknown and needs to be estimated. The empirical Bayes approach can be used to estimate that mixture distribution. Unlike hierarchical Bayes, empirical Bayes does not require complete information about the prior and hyperprior distributions. When there is not enough information at the prior level, empirical Bayes does not place a distribution at the hyperprior level as the hierarchical Bayes model does; instead, it estimates the prior parameters from the data, often via the marginal distribution. In this research, we developed new Bayesian methods for an unknown prior distribution. A set of adequate prior distributions may be defined using Bayesian model checking by setting a threshold on the posterior predictive p-value, prior predictive p-value, calibrated p-value, Bayes factor, or integrated likelihood. We derive a set of adequate posterior distributions from that set. To obtain a single posterior distribution instead of a set of adequate posterior distributions, we used a blended distribution, which minimizes the relative entropy of a set of adequate prior (or posterior) distributions to a "benchmark" prior (or posterior) distribution. We present two approaches to generating a blended posterior distribution, namely updating-before-blending and blending-before-updating. The blended posterior distribution can be used to estimate the LFDR by taking the nonlocal false discovery rate as the benchmark and the different LFDR estimators as the adequate set. The likelihood ratio can often be misleading in multiple testing unless it is supplemented by adjusted p-values or posterior probabilities based on sufficiently strong prior distributions. When the prior distributions are unknown, they can be estimated by empirical Bayes methods or blended distributions. We propose a general framework for applying the laws of likelihood to problems involving multiple hypotheses by bringing together multiple statistical models. We have applied the proposed framework to data sets from genomics, COVID-19, and other areas.
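The two ideas in the description, estimating the LFDR by empirical Bayes and blending a set of estimates toward a benchmark, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the two-group model with a N(0, 1) null and a N(mu, 1) alternative, the EM fitting routine, the simulated data, and the function names `fit_two_group`, `lfdr`, and `blend`. The blending step uses a simplification that holds for a binary null/non-null posterior: when the adequate set is an interval of probabilities, the value closest in relative entropy to the benchmark is the benchmark clipped to that interval.

```python
import math
import random


def normal_pdf(x, mean=0.0):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def lfdr(x, pi0, mu):
    """Posterior probability of the null: pi0*f0(x) / (pi0*f0(x) + (1-pi0)*f1(x))."""
    null_part = pi0 * normal_pdf(x)
    return null_part / (null_part + (1.0 - pi0) * normal_pdf(x, mu))


def fit_two_group(z, n_iter=200):
    """Empirical Bayes fit by EM of the mixture pi0*N(0,1) + (1-pi0)*N(mu,1).

    Returns (pi0, mu), the maximum marginal likelihood estimates
    reached from an arbitrary starting point (an assumption of this sketch).
    """
    pi0, mu = 0.5, 1.0
    for _ in range(n_iter):
        # E-step: current LFDR of every observation
        r = [lfdr(x, pi0, mu) for x in z]
        # M-step: update the null proportion and the alternative mean
        pi0 = sum(r) / len(z)
        mu = sum((1.0 - ri) * x for ri, x in zip(r, z)) / sum(1.0 - ri for ri in r)
    return pi0, mu


def blend(benchmark, adequate):
    """Clip the benchmark probability to the range spanned by the adequate estimates.

    For Bernoulli distributions this is the member of the interval
    [min(adequate), max(adequate)] with minimal relative entropy to the benchmark.
    """
    lo, hi = min(adequate), max(adequate)
    return min(max(benchmark, lo), hi)


# Simulated test statistics (hypothetical data): 1800 nulls, 200 non-nulls
rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(1800)] + [rng.gauss(3.0, 1.0) for _ in range(200)]

pi0_hat, mu_hat = fit_two_group(z)
print("estimated pi0:", round(pi0_hat, 3), "estimated mu:", round(mu_hat, 3))
print("LFDR at z = 0:", round(lfdr(0.0, pi0_hat, mu_hat), 4))
print("LFDR at z = 4:", round(lfdr(4.0, pi0_hat, mu_hat), 4))

# Blending: the benchmark is kept when it lies inside the adequate range,
# otherwise the nearest adequate value is used.
print("blended:", blend(0.5, [0.1, 0.3]))
```

In this sketch the benchmark passed to `blend` is just a number; in the setting described above it would be a nonlocal false discovery rate estimate, and the adequate set would hold the competing LFDR estimates for the same hypothesis.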
author2 Bickel, David
author_facet Bickel, David
Rahal, Abbas
author Rahal, Abbas
author_sort Rahal, Abbas
title Bayesian Methods Under Unknown Prior Distributions with Applications to The Analysis of Gene Expression Data
title_sort bayesian methods under unknown prior distributions with applications to the analysis of gene expression data
publisher Université d'Ottawa / University of Ottawa
publishDate 2021
url http://hdl.handle.net/10393/42408
http://dx.doi.org/10.20381/ruor-26628