Bioinformatics analyses of high-throughput genomic and transcriptomic data from nasopharyngeal carcinoma cell line, xenografts and associated Epstein-Barr virus

This thesis presents a computational system for studying nasopharyngeal carcinoma (NPC) using high-throughput sequencing data. The system has several components, including discovery of gene fusions in an NPC cell line, construction of the Epstein-Barr virus (EBV) genome, and evaluation o...

Bibliographic Details
Other Authors: Tso, Kai Yuen (author.)
Format: Others
Language: English
Chinese
Published: 2014
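Subjects: Nasopharynx--Cancer--Genetic aspects
Epstein-Barr virus
Nasopharyngeal Neoplasms--genetics
Epstein-Barr Virus Infections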
Online Access: http://repository.lib.cuhk.edu.hk/en/item/cuhk-1291547
id ndltd-cuhk.edu.hk-oai-cuhk-dr-cuhk_1291547
record_format oai_dc
spelling ndltd-cuhk.edu.hk-oai-cuhk-dr-cuhk_1291547
2019-02-19T03:47:55Z
CUHK electronic theses & dissertations collection
Tso, Kai Yuen (author.)
Yip, Kevin Yuk-Lap (thesis advisor.)
Chinese University of Hong Kong Graduate School. Division of Computer Science and Engineering. (degree granting institution.)
2014
Text
bibliography
text
electronic resource
remote
1 online resource (xv, 159 leaves) : illustrations (some color)
computer
online resource
cuhk:1291547
local: etd920160154
local: 991018540419703407
local: NP160414171650_6
eng
chi
Use of this resource is governed by the terms and conditions of the Creative Commons "Attribution-NonCommercial-NoDerivatives 4.0 International" License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
http://repository.lib.cuhk.edu.hk/en/islandora/object/cuhk%3A1291547/datastream/TN/view/Bioinformatics%20analyses%20of%20high-throughput%20genomic%20and%20transcriptomic%20data%20from%20nasopharyngeal%20carcinoma%20cell%20line%2C%20xenografts%20and%20associated%20Epstein-Barr%20virus.jpg
http://repository.lib.cuhk.edu.hk/en/item/cuhk-1291547
collection NDLTD
language English
Chinese
format Others
sources NDLTD
topic Nasopharynx--Cancer--Genetic aspects
Epstein-Barr virus
Nasopharyngeal Neoplasms--genetics
Epstein-Barr Virus Infections
WV410 .T75 2014
description This thesis presents a computational system for studying nasopharyngeal carcinoma (NPC) using high-throughput sequencing data. The system has several components: discovery of gene fusions in an NPC cell line, construction of the Epstein-Barr virus (EBV) genome, and evaluation of alignment approaches for contaminated sequencing data. We discovered a gene fusion (UBR5-ZNF423) in an NPC cell line (C666-1), which was verified by laboratory experiments and found in 8.3% of primary tumors. Regulation of this fusion gene was found to affect the growth of cancer cells. We constructed the EBV genome in C666-1. It serves as an important reference for studying this cell line, which was for a long time the only NPC cell line in the world. We also evaluated three mapping approaches, two of which are designed to filter out potential mouse contamination reads from human sequencing data; such reads can originate from NPC human-in-mouse xenografts. We found that special care should always be applied to contaminated data. Although direct mapping can give acceptable results in most cases, the combined-based approach is suggested: it can effectively reduce false positive variants while maintaining a sufficient number of true positive variants. The filtering approach is an alternative to the combined-based approach that can also effectively reduce contamination when memory is not sufficient.
=== Chinese abstract: This thesis uses computers to study nasopharyngeal carcinoma systematically, with data generated by high-throughput sequencing. Its chapters cover the search for fusion genes in an NPC cell line, the assembly of the genome of the EB virus, which is latent in the human body and can cause NPC, and an evaluation of several sequence alignment methods that can handle contaminated sequences. We discovered a fusion gene (UBR5-ZNF423) in the NPC cell line C666-1 and confirmed it experimentally; the fusion gene was found in 8.3% of primary tumor samples. We also found that regulation of this fusion gene affects the growth of cancer cells. For a long time C666-1 was the only NPC cell line in the world, so it has very important reference value; in this study we assembled the EB virus genome in C666-1 to serve as a reference for studying C666-1. In addition, we evaluated three alignment methods, two of which are designed to filter out part of the mouse genome contamination in human sequence data. Such contamination can come from xenografts, in which human tumors are transplanted into and grown in mice. We recommend using a special handling method rather than direct alignment whenever circumstances permit. Although direct alignment already performs reasonably, the combined-genome alignment method can, by comparison, effectively reduce false positive genetic variants while retaining enough true positive variants, so it should be used whenever circumstances permit. The filtering alignment method is another special handling method; it can also effectively reduce false positive genetic variants and needs less memory than the combined-genome method, so it can be used when the computer's memory is insufficient. === Tso, Kai Yuen. === Thesis M.Phil. Chinese University of Hong Kong 2014. === Includes bibliographical references (leaves 112-120). === Abstracts also in Chinese. === Title from PDF title page (viewed on 24 October 2016). === Detailed summary in vernacular field only.
author2 Tso, Kai Yuen (author.)
title Bioinformatics analyses of high-throughput genomic and transcriptomic data from nasopharyngeal carcinoma cell line, xenografts and associated Epstein-Barr virus
publishDate 2014
url http://repository.lib.cuhk.edu.hk/en/item/cuhk-1291547
_version_ 1718978337843445760