Bayesian Methods Under Unknown Prior Distributions with Applications to The Analysis of Gene Expression Data

The local false discovery rate (LFDR) is one of many existing statistical methods that analyze multiple hypothesis testing. As a Bayesian quantity, the LFDR is based on the prior probability of the null hypothesis and a mixture distribution of null and non-null hypothesis. In practice, the LFDR is u...

Full description

Bibliographic Details
Main Author: Rahal, Abbas
Other Authors: Bickel, David
Format: Others
Language:en
Published: Université d'Ottawa / University of Ottawa 2021
Subjects:
Online Access:http://hdl.handle.net/10393/42408
http://dx.doi.org/10.20381/ruor-26628
Description
Summary:The local false discovery rate (LFDR) is one of many existing statistical methods that analyze multiple hypothesis testing. As a Bayesian quantity, the LFDR is based on the prior probability of the null hypothesis and a mixture distribution of null and non-null hypothesis. In practice, the LFDR is unknown and needs to be estimated. The empirical Bayes approach can be used to estimate that mixture distribution. Empirical Bayes does not require complete information about the prior and hyper prior distributions as in hierarchical Bayes. When we do not have enough information at the prior level, and instead of placing a distribution at the hyper prior level in the hierarchical Bayes model, empirical Bayes estimates the prior parameters using the data via, often, the marginal distribution. In this research, we developed new Bayesian methods under unknown prior distribution. A set of adequate prior distributions maybe defined using Bayesian model checking by setting a threshold on the posterior predictive p-value, prior predictive p-value, calibrated p-value, Bayes factor, or integrated likelihood. We derive a set of adequate posterior distributions from that set. In order to obtain a single posterior distribution instead of a set of adequate posterior distributions, we used a blended distribution, which minimizes the relative entropy of a set of adequate prior (or posterior) distributions to a "benchmark" prior (or posterior) distribution. We present two approaches to generate a blended posterior distribution, namely, updating-before-blending and blending-before-updating. The blended posterior distribution can be used to estimate the LFDR by considering the nonlocal false discovery rate as a benchmark and the different LFDR estimators as an adequate set. The likelihood ratio can often be misleading in multiple testing, unless it is supplemented by adjusted p-values or posterior probabilities based on sufficiently strong prior distributions. In case of unknown prior distributions, they can be estimated by empirical Bayes methods or blended distributions. We propose a general framework for applying the laws of likelihood to problems involving multiple hypotheses by bringing together multiple statistical models. We have applied the proposed framework to data sets from genomics, COVID-19 and other data.