Improving Overrepresentation Analysis of Gene Ontology based on Long Noncoding RNA Association

Bibliographic Details
Main Authors: Hsiao, Chung-Chi, 蕭仲圻
Other Authors: Pai, Tun-Wen
Format: Others
Language: zh-TW
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/42ubey
id ndltd-TW-106NTOU5394027
record_format oai_dc
spelling ndltd-TW-106NTOU5394027 2019-05-16T00:59:42Z http://ndltd.ncl.edu.tw/handle/42ubey Improving Overrepresentation Analysis of Gene Ontology based on Long Noncoding RNA Association 使用長鏈非編碼RNA相關性改善基因本體論之過表現分析 Hsiao, Chung-Chi 蕭仲圻 Master's thesis, National Taiwan Ocean University, Department of Computer Science and Engineering, 106 Pai, Tun-Wen 白敦文 2018 thesis 32 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === National Taiwan Ocean University === Department of Computer Science and Engineering === 106 === Gene Ontology (GO) overrepresentation analysis is mainly used to explain the correlated behavior of differentially expressed genes. Traditional approaches test whether a cluster of differentially expressed genes gathers within a specific GO term, and apply hypergeometric distribution statistics to calculate a corresponding p-value for each GO term. GO terms with lower p-values are considered more relevant to the biological experiment. The traditional analysis ignores some hidden interactions between genes; for example, long noncoding RNAs (lncRNAs) might regulate and inhibit their target genes, which reduces the significance of some GO terms. A second problem stems from the inheritance property of the GO hierarchical structure: top-level GO terms belonging to general categories always possess lower p-values because annotated genes are distributed non-uniformly across their associated annotations. We therefore propose solutions to these two problems to increase the effectiveness and accuracy of GO overrepresentation analysis. First, we assume that differentially expressed lncRNAs might regulate their neighboring genes, and evaluate whether these lncRNAs overlap with neighboring genes or with transcription factor binding sites (TFBS) of neighboring genes. If they do, the neighboring genes are included in the GO functional overrepresentation analysis whether or not they are themselves differentially expressed. In addition, according to the GO hierarchical structure, a GO term with a significantly low p-value can be removed if the parent node possesses any child GO terms with significant p-values. To validate the proposed system, we used two RNA-seq experiments, a birc5a knock-down and a birc5a knock-out, in zebrafish embryogenesis.
For the birc5a knock-down experiment, compared with the traditional GO overrepresentation analysis, 5 additional neuron-development-related GO terms and 4 additional calcium-channel-related GO terms were discovered; for the birc5a knock-out experiment, 3 additional neuron-development-related GO terms and 3 more calcium-related GO terms were identified. Several published papers support these associations. In summary, the proposed approaches provide an accurate functional annotation method for biological and medical researchers conducting transcriptome-related experiments.
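The three steps named in the abstract (adding lncRNA-associated neighboring genes, hypergeometric p-values per GO term, and removal of parent terms) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names, the `(chrom, start, end)` region tuples, the `alpha` cutoff of 0.05, and the reading of the pruning rule (a significant term is dropped when one of its child terms is also significant) are all assumptions made for illustration.

```python
from math import comb


def hypergeom_pvalue(N, K, n, k):
    """Upper-tail probability P(X >= k) of the hypergeometric distribution.

    N: genes in the background, K: background genes annotated with the GO term,
    n: genes in the study set, k: study-set genes annotated with the GO term.
    """
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def overlaps(a, b):
    """True if two (chrom, start, end) closed intervals share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def expand_gene_set(de_genes, de_lncrnas, gene_regions):
    """Add genes whose gene body or TFBS region overlaps a differentially
    expressed lncRNA, whether or not the gene itself is differentially expressed.

    gene_regions maps a gene name to a list of regions (hypothetical layout).
    """
    expanded = set(de_genes)
    for gene, regions in gene_regions.items():
        if any(overlaps(lnc, region) for lnc in de_lncrnas for region in regions):
            expanded.add(gene)
    return expanded


def prune_parents(pvalues, children, alpha=0.05):
    """Keep significant terms, dropping any term that has a significant child.

    This is one possible reading of the pruning rule in the abstract.
    """
    significant = {term for term, p in pvalues.items() if p < alpha}
    return {
        term
        for term in significant
        if not any(child in significant for child in children.get(term, ()))
    }
```

In this sketch the expanded gene set from `expand_gene_set` would supply the counts `n` and `k` passed to `hypergeom_pvalue`, and `prune_parents` would be applied to the resulting per-term p-values.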
author2 Pai, Tun-Wen
author_facet Pai, Tun-Wen
Hsiao, Chung-Chi
蕭仲圻
author Hsiao, Chung-Chi
蕭仲圻
author_sort Hsiao, Chung-Chi
title Improving Overrepresentation Analysis of Gene Ontology based on Long Noncoding RNA Association
publishDate 2018
url http://ndltd.ncl.edu.tw/handle/42ubey