A module based approach for identifying driver genes and expanding pathways from integrated biological networks

Each gene or protein has its own function which, when combined with others, allows the group to perform more complex behaviors, e.g. carry out a particular cellular task (functional module) or affect a particular disease phenotype (disease module). One of the major challenges in systems biology is t...

Bibliographic Details
Main Author: Huang, Chia-Ling
Language: en_US
Published: 2016
Subjects: Bioinformatics; Breast cancer; Driver; Module; Network; Ovarian cancer
Online Access: https://hdl.handle.net/2144/14289
id ndltd-bu.edu-oai-open.bu.edu-2144-14289
record_format oai_dc
spelling ndltd-bu.edu-oai-open.bu.edu-2144-14289 2020-01-29T15:02:16Z A module based approach for identifying driver genes and expanding pathways from integrated biological networks Huang, Chia-Ling Bioinformatics Breast cancer Driver Module Network Ovarian cancer Each gene or protein has its own function which, when combined with others, allows the group to perform more complex behaviors, e.g. carry out a particular cellular task (functional module) or affect a particular disease phenotype (disease module). One of the major challenges in systems biology is to reveal the roles of genes or proteins in functional modules or disease modules. In the first part of the dissertation, I present a data-driven method, Correlation Set Analysis (CSA), for comprehensively detecting active regulators in disease populations by integrating co-expression analysis and specific types of literature-derived causal relationships. Instead of investigating the co-expression level between regulators and their targets, I focus on the coherence of a regulator's regulatees, e.g. the downstream targets of a transcription factor. Using simulated datasets, I show that my method can reach high true positive and true negative rates (>80%) even when the regulatory relationships are weak (only 20% of regulatees are co-expressed). Using three separate real biological datasets, I was able to recover well-known and as-yet-undescribed active regulators for each disease population. In the second part of the dissertation, I develop and apply a new computational algorithm for detecting modules of functionally related genes that are likely to drive malignant transformation. The algorithm takes as input the identity and locations of a small number of known oncogenes (a seed set) on a human genome functional linkage network (FLN). 
It then searches for a boundary surrounding a gene set encompassing the seed, such that the magnitude of the difference in linkage weights between interior-interior gene pairs and interior-exterior gene pairs is maximized. Starting with small seed sets for breast and ovarian cancer, I successfully identify known and novel drivers in both cancer types. In the third part of the dissertation, I propose a module-based approach for expanding manually curated functional modules. I use the KEGG pathway database as an example, and the results show that my approach can successfully suggest both validated pathway members (genes that are assigned to a particular pathway by other manually curated pathway databases) and novel candidate pathway genes. 2016-02-04T19:12:53Z 2016-02-04T19:12:53Z 2014 2016-01-22T18:55:27Z Thesis/Dissertation https://hdl.handle.net/2144/14289 en_US
collection NDLTD
language en_US
sources NDLTD
topic Bioinformatics
Breast cancer
Driver
Module
Network
Ovarian cancer
description Each gene or protein has its own function which, when combined with others, allows the group to perform more complex behaviors, e.g. carry out a particular cellular task (functional module) or affect a particular disease phenotype (disease module). One of the major challenges in systems biology is to reveal the roles of genes or proteins in functional modules or disease modules. In the first part of the dissertation, I present a data-driven method, Correlation Set Analysis (CSA), for comprehensively detecting active regulators in disease populations by integrating co-expression analysis and specific types of literature-derived causal relationships. Instead of investigating the co-expression level between regulators and their targets, I focus on the coherence of a regulator's regulatees, e.g. the downstream targets of a transcription factor. Using simulated datasets, I show that my method can reach high true positive and true negative rates (>80%) even when the regulatory relationships are weak (only 20% of regulatees are co-expressed). Using three separate real biological datasets, I was able to recover well-known and as-yet-undescribed active regulators for each disease population. In the second part of the dissertation, I develop and apply a new computational algorithm for detecting modules of functionally related genes that are likely to drive malignant transformation. The algorithm takes as input the identity and locations of a small number of known oncogenes (a seed set) on a human genome functional linkage network (FLN). It then searches for a boundary surrounding a gene set encompassing the seed, such that the magnitude of the difference in linkage weights between interior-interior gene pairs and interior-exterior gene pairs is maximized. Starting with small seed sets for breast and ovarian cancer, I successfully identify known and novel drivers in both cancer types. 
In the third part of the dissertation, I propose a module-based approach for expanding manually curated functional modules. I use the KEGG pathway database as an example, and the results show that my approach can successfully suggest both validated pathway members (genes that are assigned to a particular pathway by other manually curated pathway databases) and novel candidate pathway genes.
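The boundary search described for the second part of the dissertation can be illustrated with a small sketch. This is not the author's algorithm: the record does not give the exact objective or search strategy, so the code below assumes the score is the mean interior-interior linkage weight minus the mean interior-exterior linkage weight, and it assumes a simple greedy expansion from the seed. The function names (boundary_score, expand_module), the gene labels, and the toy weights are all hypothetical.

```python
from itertools import combinations


def mean(values):
    """Arithmetic mean, defined as 0.0 for an empty collection."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def boundary_score(weights, module, genes):
    """Assumed objective: mean interior-interior weight minus
    mean interior-exterior weight. Missing edges count as weight 0."""
    inside = [weights.get(frozenset(pair), 0.0)
              for pair in combinations(sorted(module), 2)]
    outside = [weights.get(frozenset((g, h)), 0.0)
               for g in module for h in genes - module]
    return mean(inside) - mean(outside)


def expand_module(weights, seed):
    """Greedy expansion (an assumption, not the dissertation's search):
    repeatedly add the single gene that raises the score the most,
    and stop as soon as no addition improves it."""
    genes = set().union(*weights)  # every gene that appears on some edge
    module = set(seed)
    score = boundary_score(weights, module, genes)
    while genes - module:
        # Score each candidate; break ties by gene name for determinism.
        scored = {g: boundary_score(weights, module | {g}, genes)
                  for g in genes - module}
        best = max(scored, key=lambda g: (scored[g], g))
        if scored[best] <= score:
            break
        module.add(best)
        score = scored[best]
    return module, score


# Hypothetical toy network: edges keyed by unordered gene pairs.
weights = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.8,
    frozenset(("B", "C")): 0.85,
    frozenset(("C", "D")): 0.1,
    frozenset(("D", "E")): 0.7,
    frozenset(("A", "D")): 0.05,
    frozenset(("B", "E")): 0.1,
}
module, score = expand_module(weights, {"A", "B"})
print(sorted(module), round(score, 4))
```

In this toy network the seed {A, B} absorbs C, which is strongly linked to both seed genes, and then stops because adding D or E would pull weakly linked genes inside the boundary. A real functional linkage network would be far larger, and an exhaustive or annealing-style search might be preferable to this greedy step.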
author Huang, Chia-Ling
title A module based approach for identifying driver genes and expanding pathways from integrated biological networks
publishDate 2016
url https://hdl.handle.net/2144/14289