Reconstruction of metabolic pathways by the exploration of gene expression data with factor analysis

Microarray gene expression data for thousands of genes in many organisms is quickly becoming available. The information this data can provide the experimental biologist is powerful. This data may provide information clarifying the regulatory linkages between genes within a single metabolic pathway...

Bibliographic Details
Main Author: Henderson, David Allen
Other Authors: Genetics
Format: Others
Published: Virginia Tech 2014
Subjects: genetic regulation; gene network; microarray; factor analysis
Online Access: http://hdl.handle.net/10919/30089
http://scholar.lib.vt.edu/theses/available/etd-12142001-175024/
id ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-30089
record_format oai_dc
spelling ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-30089
2020-10-17T06:35:20Z
Reconstruction of metabolic pathways by the exploration of gene expression data with factor analysis
Henderson, David Allen
Genetics
Hoeschele, Ina
Smith, Eric P.
Mendes, Pedro J. P.
Saghai-Maroof, Mohammad A.
Notter, David R.
genetic regulation
gene network
microarray
factor analysis
Ph. D.
2014-03-14T20:20:17Z
2014-03-14T20:20:17Z
2001-12-14
2001-12-14
2002-12-18
2001-12-18
Dissertation
etd-12142001-175024
http://hdl.handle.net/10919/30089
http://scholar.lib.vt.edu/theses/available/etd-12142001-175024/
daves_diss.pdf
In Copyright
http://rightsstatements.org/vocab/InC/1.0/
application/pdf
Virginia Tech
collection NDLTD
format Others
sources NDLTD
topic genetic regulation
gene network
microarray
factor analysis
description Microarray gene expression data for thousands of genes in many organisms is quickly becoming available. The information this data can provide the experimental biologist is powerful. This data may clarify the regulatory linkages between genes within a single metabolic pathway, or alternative pathway routes under different environmental conditions, or lead to the identification of genes for selection in animal and plant genetic improvement programs or of targets for drug therapy. Many analysis methods to unlock this information have been proposed and used, but not evaluated under known conditions (e.g., simulations). Within this dissertation, an analysis method is proposed and evaluated for identifying independent and linked metabolic pathways, and it is compared to a popular analysis method. This same analysis method is also investigated for its ability to identify regulatory linkages within a single metabolic pathway. Lastly, a variant of this method is used to analyze time series microarray data. In Chapter 2, Factor Analysis is shown to identify and group genes according to membership within independent metabolic pathways for steady state microarray gene expression data. There were cases, however, where the allocation of all genes to a pathway was not complete. A competing analysis method, Hierarchical Clustering, was shown to perform poorly when negatively correlated genes are assumed unrelated, but performance improved when the sign of the correlation coefficient was ignored. In Chapter 3, Factor Analysis is shown to identify regulatory relationships between genes within a single metabolic pathway. These relationships can be explained using metabolic control analysis, along with external knowledge of the pathway structure and of activation and inhibition of transcription regulation. In this chapter, metabolic control analysis is also used to show why factor analysis can group genes by metabolic pathway. In Chapter 4, a Bayesian exploratory factor analysis is developed and used to analyze microarray gene expression data. This Bayesian model differs from a previous implementation in that it is purely exploratory and can be used with vague or uninformative priors. Additionally, 95% highest posterior density regions can be calculated for each factor loading to aid in interpretation of the loadings. A correlated Bayesian exploratory factor analysis model is also developed in this chapter for application to time series microarray gene expression data. While this method is appropriate for the analysis of correlated observation vectors, it fails to group genes by metabolic pathway for simulated time series data.
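The description's remark about Hierarchical Clustering and the sign of the correlation coefficient can be illustrated with a minimal sketch. This is not code from the dissertation: the distance definitions `1 - r` and `1 - |r|`, the simulated genes, and all function and variable names are assumptions chosen only for illustration. The sketch shows why a gene that is strongly but negatively correlated with another looks maximally distant under a signed distance and nearly identical once the sign is ignored.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def signed_distance(x, y):
    """Assumed distance 1 - r: negatively correlated genes are far apart."""
    return 1.0 - pearson(x, y)


def unsigned_distance(x, y):
    """Assumed distance 1 - |r|: the sign of the correlation is ignored."""
    return 1.0 - abs(pearson(x, y))


# Hypothetical simulated expression levels over 50 arrays (not real data).
rng = random.Random(0)
activator = [rng.gauss(0, 1) for _ in range(50)]
# A hypothetical repressed gene: mirrors the activator, plus small noise.
repressed = [-a + rng.gauss(0, 0.1) for a in activator]
# A hypothetical gene generated independently of the other two.
unrelated = [rng.gauss(0, 1) for _ in range(50)]

d_signed = signed_distance(activator, repressed)
d_unsigned = unsigned_distance(activator, repressed)
d_unrelated = unsigned_distance(activator, unrelated)

print(f"signed distance, activator vs repressed:   {d_signed:.3f}")
print(f"unsigned distance, activator vs repressed: {d_unsigned:.3f}")
print(f"unsigned distance, activator vs unrelated: {d_unrelated:.3f}")
```

Under the signed distance, a clustering procedure would tend to place the activator and the repressed gene in different groups even though they are tightly linked; under the unsigned distance they become near neighbours, which is consistent with the improvement the description reports.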
author2 Genetics
author Henderson, David Allen
title Reconstruction of metabolic pathways by the exploration of gene expression data with factor analysis