An Efficient Algorithm for Clustering Genomic Data

Bibliographic Details
Main Author: Zhou, Xuan
Language: English
Published: University of Cincinnati / OhioLINK 2014
Subjects: Computer Science; genomic data; clustering; discretization; 1D-Jury; dimension reduction
Online Access: http://rave.ohiolink.edu/etdc/view?acc_num=ucin1418910389
id ndltd-OhioLink-oai-etd.ohiolink.edu-ucin1418910389
record_format oai_dc
spelling ndltd-OhioLink-oai-etd.ohiolink.edu-ucin1418910389 2021-08-03T06:28:50Z An Efficient Algorithm for Clustering Genomic Data Zhou, Xuan Computer Science; genomic data; clustering; discretization; 1D-Jury; dimension reduction In this thesis, we investigated an efficient framework for clustering analysis of gene expression profiles. The framework discretizes continuous genomic data and adopts the 1D-Jury approach, previously used for protein model quality assessment, for fast clustering. We demonstrated, through an empirical analysis of multiple data sets from independent studies, that the loss of information due to discretization of genomic data is limited. Patterns observed in the original data can largely be recovered from the discretized expression profiles, and discretization enables efficient identification of genomic signatures and clustering of expression profiles. We further studied the application of the 1D-Jury approach to reducing the dimensionality of genomic data. We demonstrated that discretization and 1D-Jury score projection efficiently reduced the dimensionality of the feature space. More importantly, the proposed discretization-projection heuristic enhanced the discovery of cluster structure and patterns in the data. Therefore, the proposed discretization-projection method can be a valuable tool for the analysis of gene expression data. 2014 English text University of Cincinnati / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=ucin1418910389 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
collection NDLTD
language English
sources NDLTD
topic Computer Science
genomic data
clustering
discretization
1D-Jury
dimension reduction
author Zhou, Xuan
title An Efficient Algorithm for Clustering Genomic Data
publisher University of Cincinnati / OhioLINK
publishDate 2014
url http://rave.ohiolink.edu/etdc/view?acc_num=ucin1418910389