Multiple Model Species Selection for Transcriptomics Sequencing and Functional Analysis


Bibliographic Details
Main Authors: Li, Kuan-Hung, 李冠葒
Other Authors: Pai, Tun-Wen
Format: Others
Language: en_US
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/w7bamz
id ndltd-TW-104NTOU5394047
record_format oai_dc
spelling ndltd-TW-104NTOU5394047 2019-05-15T23:00:45Z http://ndltd.ncl.edu.tw/handle/w7bamz Multiple Model Species Selection for Transcriptomics Sequencing and Functional Analysis 多模式物種選擇用於轉錄體定序功能分析 Li, Kuan-Hung 李冠葒 Pai, Tun-Wen 白敦文 2016 學位論文 ; thesis 38 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description 碩士 (Master's) === 國立臺灣海洋大學 (National Taiwan Ocean University) === 資訊工程學系 (Department of Computer Science and Engineering) === 104 === High-throughput sequencing technology provides an efficient and effective way to discover the sequence content and corresponding abundance of RNAs in a biological sample; this approach is called RNA sequencing (RNA-seq). In recent years, RNA-seq applications have expanded rapidly because of the technology's high throughput and relatively fast experimental turnaround, bringing unprecedented progress in gene functional annotation, gene regulation analysis, and the verification of environmental factors. RNA-seq has been applied in various fields through the detection of differential gene expression. However, as the volume of sequenced reads and the number of reference model species grow, choosing appropriate reference species for gene annotation has become a new challenge. This study therefore proposes a novel approach for finding the most effective reference model species by comparing ultra-conserved orthologous genes (UCO) among species. An online multiple species selection (MSS) system for RNA-seq differential expression analysis was developed, covering a total of 167 eukaryotic reference model species built from data retrieved from the RefSeq, KEGG, and UniProt online databases. The system not only supports the selection of appropriate reference species through UCO and taxonomy associations, but also lets users perform differential expression analysis with gene ontology and biological pathway approaches for functional annotation. In this thesis, we verified the correlation of UCO gene distance matrices among species and evaluated the results of different reference species selections on RNA-seq datasets from a de novo organism. The results showed that selecting multiple appropriate species can compensate for the lack of annotation information and yields more accurate results than using a single reference model species.
author2 Pai, Tun-Wen
author_facet Pai, Tun-Wen
Li, Kuan-Hung
李冠葒
author Li, Kuan-Hung
李冠葒
author_sort Li, Kuan-Hung
title Multiple Model Species Selection for Transcriptomics Sequencing and Functional Analysis
publishDate 2016
url http://ndltd.ncl.edu.tw/handle/w7bamz