Regulatory Genes Prediction with Microarray Data and Ontology
Doctoral === Tamkang University === Doctoral Program, Department of Computer Science and Information Engineering === 99 === Microarray technology lets scientists analyze thousands of gene expression profiles simultaneously. However, microarray gene expression data often contain multiple missing expression values, for a variety of reasons. Effective methods to imput...
Main Authors: | Chao-Hsun Yang, 楊朝勛 |
---|---|
Other Authors: | 許輝煌 |
Format: | Others |
Language: | en_US |
Published: | 2011 |
Online Access: | http://ndltd.ncl.edu.tw/handle/96290397700854506547 |
id |
ndltd-TW-099TKU05392007 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-099TKU053920072015-10-30T04:10:10Z http://ndltd.ncl.edu.tw/handle/96290397700854506547 Regulatory Genes Prediction with Microarray Data and Ontology 使用微陣列資料與本體論於基因調控關係預測 Chao-Hsun Yang 楊朝勛 Doctoral, Tamkang University, Doctoral Program, Department of Computer Science and Information Engineering, 99 許輝煌 2011 thesis 121 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Doctoral === Tamkang University === Doctoral Program, Department of Computer Science and Information Engineering === 99 === Microarray technology lets scientists analyze thousands of gene expression profiles simultaneously. However, microarray gene expression data often contain multiple missing expression values, for a variety of reasons. Effective methods to impute these missing values are needed because many algorithms for microarray data analysis require a complete matrix of gene expression values. In addition, selecting informative genes is essential when analyzing such large amounts of data. A number of methods have been proposed to meet this need from various points of view, but most existing methods have limitations and disadvantages.
In this dissertation, we propose a novel approach to predicting potential regulatory gene pairs using a distance measurement that effectively estimates the distance between gene pairs. The distance measurement is based on the dynamic time warping (DTW) algorithm and the well-defined Gene Ontology (GO) structure for genes or proteins. GO contains definitions (annotations) that describe the biological meaning of genes. The semantic distance between two genes in the biological sense can be measured by a suitable quantitative assessment of their corresponding GO annotations. Our distance measurement takes into account both the DTW distance between expression values and the GO semantic distance of each gene pair.
We also propose a novel missing value imputation approach that combines our distance measurement with the k-nearest neighbor (KNN) method. Experimental results show that our imputation approach outperforms other major methods on the commonly used assessment measure. After the missing values in the raw microarray time series data have been estimated with our imputation approach, we apply our gene regulation prediction approach. According to the experimental results, it discovers more known regulatory gene pairs than other methods. Research on microarray time series data can hence be improved and facilitated by our approaches.
|
author2 |
許輝煌 |
author_facet |
許輝煌 Chao-Hsun Yang 楊朝勛 |
author |
Chao-Hsun Yang 楊朝勛 |
author_sort |
Chao-Hsun Yang |
title |
Regulatory Genes Prediction with Microarray Data and Ontology |
title_sort |
regulatory genes prediction with microarray data and ontology |
publishDate |
2011 |
url |
http://ndltd.ncl.edu.tw/handle/96290397700854506547 |
_version_ |
1718116793289015296 |
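
The abstract describes a gene-pair distance that blends a DTW distance over expression time series with a GO semantic distance, and a KNN imputation built on that distance. The record does not give the actual formulas, so the following is only a minimal sketch under stated assumptions: the weighted sum with `alpha`, the precomputed `go_distance` inputs, the unweighted neighbor average, and all function names are hypothetical and are not taken from the dissertation. Scaling of the two distances to a common range is also left out.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def combined_distance(expr_a, expr_b, go_distance, alpha=0.5):
    """Assumed blend: alpha * DTW + (1 - alpha) * GO semantic distance.

    go_distance is taken as a precomputed number; how it is derived from
    GO annotations is not specified in this record.
    """
    return alpha * dtw_distance(expr_a, expr_b) + (1.0 - alpha) * go_distance


def knn_impute(target, candidates, go_distances, k=2, alpha=0.5):
    """Fill None entries of target from its k nearest complete candidates.

    Distances are computed on the time points observed in target, and the
    missing values are replaced by the plain mean of the neighbors
    (an assumption; a weighted mean is equally plausible).
    """
    observed = [i for i, v in enumerate(target) if v is not None]
    target_obs = [target[i] for i in observed]
    scored = []
    for idx, (cand, go_d) in enumerate(zip(candidates, go_distances)):
        cand_obs = [cand[i] for i in observed]
        scored.append((combined_distance(target_obs, cand_obs, go_d, alpha), idx))
    scored.sort()
    nearest = [candidates[idx] for _, idx in scored[:k]]
    return [
        v if v is not None else sum(c[i] for c in nearest) / len(nearest)
        for i, v in enumerate(target)
    ]
```

As a usage illustration with made-up numbers, `knn_impute([1.0, None, 3.0], [[1.0, 2.0, 3.0], [1.0, 4.0, 3.0], [9.0, 9.0, 9.0]], [0.1, 0.2, 0.9])` picks the first two candidate profiles as neighbors and fills the gap with their mean at that time point. Ranking gene pairs by `combined_distance` would be one way to shortlist candidate regulatory pairs, again as an assumption rather than the method the record documents.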