Summary: | Master's === National Central University === Institute of Systems Biology and Bioinformatics === 100 === Gene set-based analysis (GSA) has been widely applied to gene expression microarray data since its first application in 2003, to explore the association of biological features with phenotypes based on prior pathway knowledge. GSA focuses on sets of related genes and has shown major advantages over individual gene analysis (IGA) in accuracy, robustness, and biological relevance. However, previous GSA studies have not considered the relationships within gene sets, which may limit its functionality and applications. Here, we present an analytical framework called the Gene Set-based Local Hierarchical Clustering (GSLHC) approach, which may provide biologically valuable insights into coordinated functional actions and improve the classification of heterogeneous subtypes of drug-driven responses. We successfully applied GSLHC to the Connectivity Map (C-Map) dataset with various gene sets from the Molecular Signatures Database (MSigDB). The GSLHC approach eliminated the cell type effects that were clearly observed with IGA and performed significantly better than IGA on sample clustering and drug-target association. Furthermore, based on significantly enriched gene sets, GSLHC identified 18 unknown compounds that are functionally associated with their most correlated drug neighbors, 8 of which have putative anti-cancer activity. With extended applicability, GSLHC will facilitate gaining biological insights into unknown drug discovery, drug repositioning, gene-set pattern diagnosis of common diseases, and function-based class categorization of heterogeneous cancer subtypes.
|