biDCG: a new method for discovering global features of DNA microarray data via an iterative re-clustering procedure.
Biclustering techniques have become very popular in cancer genetics studies, as they are tools that are expected to connect phenotypes to genotypes, i.e. to identify subgroups of cancer patients based on the fact that they share similar gene expression patterns as well as to identify subgroups of ge...
Main Authors: | Chia-Pei Chen, Hsieh Fushing, Rob Atwill, Patrice Koehl |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2014-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC4105625?pdf=render |
id |
doaj-051ffeda26604d3794ae43503374b09a |
---|---|
record_format |
Article |
spelling |
doaj-051ffeda26604d3794ae43503374b09a 2020-11-25T02:33:38Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2014-01-01 9 7 e102445 10.1371/journal.pone.0102445 biDCG: a new method for discovering global features of DNA microarray data via an iterative re-clustering procedure. Chia-Pei Chen Hsieh Fushing Rob Atwill Patrice Koehl Biclustering techniques have become very popular in cancer genetics studies, as they are tools that are expected to connect phenotypes to genotypes, i.e. to identify subgroups of cancer patients based on the fact that they share similar gene expression patterns as well as to identify subgroups of genes that are specific to these subtypes of cancer and therefore could serve as biomarkers. In this paper we propose a new approach for identifying such relationships or biclusters between patients and gene expression profiles. This method, named biDCG, rests on two key concepts. First, it uses a new clustering technique, DCG-tree [Fushing et al., PLoS ONE, 8, e56259 (2013)], that generates ultrametric topological spaces that capture the geometries of both the patient data set and the gene data set. Second, it optimizes the definitions of bicluster membership through an iterative two-way reclustering procedure in which patients and genes are reclustered in turn, based respectively on subsets of genes and patients defined in the previous round. We have validated biDCG on simulated and real data. Based on the simulated data we have shown that biDCG compares favorably to other biclustering techniques applied to cancer genomics data. The results on the real data sets have shown that biDCG is able to retrieve relevant biological information. http://europepmc.org/articles/PMC4105625?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Chia-Pei Chen Hsieh Fushing Rob Atwill Patrice Koehl |
author_sort |
Chia-Pei Chen |
title |
biDCG: a new method for discovering global features of DNA microarray data via an iterative re-clustering procedure. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2014-01-01 |
description |
Biclustering techniques have become very popular in cancer genetics studies, as they are tools that are expected to connect phenotypes to genotypes, i.e. to identify subgroups of cancer patients based on the fact that they share similar gene expression patterns as well as to identify subgroups of genes that are specific to these subtypes of cancer and therefore could serve as biomarkers. In this paper we propose a new approach for identifying such relationships or biclusters between patients and gene expression profiles. This method, named biDCG, rests on two key concepts. First, it uses a new clustering technique, DCG-tree [Fushing et al., PLoS ONE, 8, e56259 (2013)], that generates ultrametric topological spaces that capture the geometries of both the patient data set and the gene data set. Second, it optimizes the definitions of bicluster membership through an iterative two-way reclustering procedure in which patients and genes are reclustered in turn, based respectively on subsets of genes and patients defined in the previous round. We have validated biDCG on simulated and real data. Based on the simulated data we have shown that biDCG compares favorably to other biclustering techniques applied to cancer genomics data. The results on the real data sets have shown that biDCG is able to retrieve relevant biological information. |
url |
http://europepmc.org/articles/PMC4105625?pdf=render |