Bayesian learning framework with kernel-imbedded Gaussian processes applied to microarray analysis

Thesis (Ph.D.)--University of Hawaii at Manoa, 2008. === DNA microarray technology has provided researchers a high-throughput means to simultaneously measure expression levels for thousands of genes in an experiment. With a probit regression setting and assuming that the link function between signif...

Bibliographic Details
Main Author: Zhao, Xin
Language: en-US
Published: 2011
Online Access: http://hdl.handle.net/10125/20510
id ndltd-UHAWAII-oai-scholarspace.manoa.hawaii.edu-10125-20510
record_format oai_dc
spelling ndltd-UHAWAII-oai-scholarspace.manoa.hawaii.edu-10125-20510 === 2013-01-08T11:15:33Z === Bayesian learning framework with kernel-imbedded Gaussian processes applied to microarray analysis === Zhao, Xin === Thesis (Ph.D.)--University of Hawaii at Manoa, 2008. === Includes bibliographical references (leaves xxx-xxx). === Also available by subscription via World Wide Web === 179 leaves, bound 29 cm === 2011-07-21T23:06:44Z === 2011-07-21T23:06:44Z === 2008 === Thesis === Text === 9780549787891 === http://hdl.handle.net/10125/20510 === en-US === Theses for the degree of Doctor of Philosophy (University of Hawaii at Manoa) no. XXX === All UHM dissertations and theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission from the copyright owner.
collection NDLTD
language en-US
sources NDLTD
description Thesis (Ph.D.)--University of Hawaii at Manoa, 2008. === DNA microarray technology has given researchers a high-throughput means of simultaneously measuring expression levels for thousands of genes in a single experiment. In a probit regression setting, and assuming that the link function between the expression data of significant genes and the latent variable for the response label is a Gaussian process, a kernel-induced hierarchical Bayesian framework is built for a cancer classification problem using microarray gene expression data. === In summary, this study presents a kernel-induced hierarchical Bayesian framework, built on a Gaussian process model, that uses microarray gene expression data for a multi-class cancer classification problem. Our main contribution is a fully automated learning algorithm for solving this Bayesian model. Satisfactory results have been achieved in both the simulated examples and the real-world data studies. === Six published microarray datasets were analyzed in this study. The results show that the predictive performance of our method on all of these datasets is better than, or at least as good as, that of other state-of-the-art microarray analysis methods. Our method shows a particular advantage on one dataset that contains multiple suspected mislabeled samples. For each of these datasets, we identified a set of significant genes that can be used for further biological inspection at the genome level. === Targeting a multi-class classification problem and adopting a variable selection approach with a Gibbs sampler at its core, we developed an algorithm, the kernel-imbedded Gaussian process (KIGP), to analyze microarray data under a Bayesian framework. Through a feature projection procedure, and using a univariate ranking scheme as the gene-selection strategy, we further designed an alternative microarray analysis model, the natural kernel-imbedded Gaussian process (NKIGP). Finally, we present an efficient algorithm with a cascading structure, embedding a reversible jump Markov chain Monte Carlo (RJMCMC) model, that unifies the methods proposed in this study. === The simulated examples demonstrate that our method almost always performs close to the Bayesian bound, both in cases with linear Bayesian classifiers and in cases with highly non-linear Bayesian classifiers. Our method remains robust even with mislabeled training samples, which shows its broad applicability to microarray analysis problems on which linear methods may perform unreliably. === Includes bibliographical references (leaves xxx-xxx). === Also available by subscription via World Wide Web === 179 leaves, bound 29 cm
author Zhao, Xin
title Bayesian learning framework with kernel-imbedded Gaussian processes applied to microarray analysis
publishDate 2011
url http://hdl.handle.net/10125/20510