Multiple Model Species Selection for Transcriptomics Sequencing and Functional Analysis
Master's === National Taiwan Ocean University === Department of Computer Science and Engineering === 104 === High-throughput sequencing technology provides an efficient and effective approach for discovering the sequence content and corresponding quantity of RNAs in a biological sample; this approach is called RNA sequencing (RNA-seq). In recent years, R...
Main Authors: | Li, Kuan-Hung, 李冠葒 |
---|---|
Other Authors: | Pai, Tun-Wen |
Format: | Others |
Language: | en_US |
Published: | 2016 |
Online Access: | http://ndltd.ncl.edu.tw/handle/w7bamz |
id |
ndltd-TW-104NTOU5394047 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-104NTOU5394047 2019-05-15T23:00:45Z http://ndltd.ncl.edu.tw/handle/w7bamz Multiple Model Species Selection for Transcriptomics Sequencing and Functional Analysis 多模式物種選擇用於轉錄體定序功能分析 Li, Kuan-Hung 李冠葒 Master's National Taiwan Ocean University Department of Computer Science and Engineering 104 High-throughput sequencing technology provides an efficient and effective approach for discovering the sequence content and corresponding quantity of RNAs in a biological sample; this approach is called RNA sequencing (RNA-seq). In recent years, RNA-seq applications have expanded rapidly owing to the technology's high throughput and relatively fast experimental turnaround, which have brought unprecedented advances in gene functional annotation, gene regulation analysis, and verification of environmental factors. RNA-seq has been applied in various fields to detect differential gene expression. However, with the increasing number of sequenced reads and reference model species, choosing appropriate reference species for gene annotation has become a new challenge. This study therefore proposes a novel approach for finding the most effective reference model species by comparing ultra-conserved orthologous (UCO) genes among species. An online multiple species selection (MSS) system for RNA-seq differential expression analysis was developed, built on a total of 167 eukaryotic reference model species retrieved from the RefSeq, KEGG, and UniProt online databases. The system not only supports selection of appropriate reference species through UCO and taxonomy associations, but also allows users to perform differential expression analysis with functional annotation based on gene ontology and biological pathways. In this thesis, we verified the correlation of UCO gene distance matrices among species and evaluated the results of various reference species selections on RNA-seq datasets from a de novo organism. The results showed that selecting multiple appropriate species can compensate for missing annotation and yield more accurate results than a single reference model species. Pai, Tun-Wen 白敦文 2016 thesis 38 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan Ocean University === Department of Computer Science and Engineering === 104 === High-throughput sequencing technology provides an efficient and effective approach for discovering the sequence content and corresponding quantity of RNAs in a biological sample; this approach is called RNA sequencing (RNA-seq). In recent years, RNA-seq applications have expanded rapidly owing to the technology's high throughput and relatively fast experimental turnaround, which have brought unprecedented advances in gene functional annotation, gene regulation analysis, and verification of environmental factors. RNA-seq has been applied in various fields to detect differential gene expression. However, with the increasing number of sequenced reads and reference model species, choosing appropriate reference species for gene annotation has become a new challenge. This study therefore proposes a novel approach for finding the most effective reference model species by comparing ultra-conserved orthologous (UCO) genes among species. An online multiple species selection (MSS) system for RNA-seq differential expression analysis was developed, built on a total of 167 eukaryotic reference model species retrieved from the RefSeq, KEGG, and UniProt online databases. The system not only supports selection of appropriate reference species through UCO and taxonomy associations, but also allows users to perform differential expression analysis with functional annotation based on gene ontology and biological pathways. In this thesis, we verified the correlation of UCO gene distance matrices among species and evaluated the results of various reference species selections on RNA-seq datasets from a de novo organism. The results showed that selecting multiple appropriate species can compensate for missing annotation and yield more accurate results than a single reference model species. |
author2 |
Pai, Tun-Wen |
author_facet |
Pai, Tun-Wen Li, Kuan-Hung 李冠葒 |
author |
Li, Kuan-Hung 李冠葒 |
spellingShingle |
Li, Kuan-Hung 李冠葒 Multiple Model Species Selection for Transcriptomics Sequencing and Functional Analysis |
author_sort |
Li, Kuan-Hung |
title |
Multiple Model Species Selection for Transcriptomics Sequencing and Functional Analysis |
title_short |
Multiple Model Species Selection for Transcriptomics Sequencing and Functional Analysis |
title_full |
Multiple Model Species Selection for Transcriptomics Sequencing and Functional Analysis |
title_fullStr |
Multiple Model Species Selection for Transcriptomics Sequencing and Functional Analysis |
title_full_unstemmed |
Multiple Model Species Selection for Transcriptomics Sequencing and Functional Analysis |
title_sort |
multiple model species selection for transcriptomics sequencing and functional analysis |
publishDate |
2016 |
url |
http://ndltd.ncl.edu.tw/handle/w7bamz |
work_keys_str_mv |
AT likuanhung multiplemodelspeciesselectionfortranscriptomicssequencingandfunctionalanalysis AT lǐguānhóng multiplemodelspeciesselectionfortranscriptomicssequencingandfunctionalanalysis AT likuanhung duōmóshìwùzhǒngxuǎnzéyòngyúzhuǎnlùtǐdìngxùgōngnéngfēnxī AT lǐguānhóng duōmóshìwùzhǒngxuǎnzéyòngyúzhuǎnlùtǐdìngxùgōngnéngfēnxī |
_version_ |
1719138401522810880 |