biDCG: a new method for discovering global features of DNA microarray data via an iterative re-clustering procedure.

Biclustering techniques have become very popular in cancer genetics studies, as they are tools that are expected to connect phenotypes to genotypes, i.e. to identify subgroups of cancer patients based on the fact that they share similar gene expression patterns as well as to identify subgroups of ge...

Bibliographic Details
Main Authors: Chia-Pei Chen, Hsieh Fushing, Rob Atwill, Patrice Koehl
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4105625?pdf=render
id doaj-051ffeda26604d3794ae43503374b09a
record_format Article
spelling doaj-051ffeda26604d3794ae43503374b09a; 2020-11-25T02:33:38Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2014-01-01; vol. 9, no. 7, e102445; doi:10.1371/journal.pone.0102445; biDCG: a new method for discovering global features of DNA microarray data via an iterative re-clustering procedure.; Chia-Pei Chen, Hsieh Fushing, Rob Atwill, Patrice Koehl
collection DOAJ
language English
format Article
sources DOAJ
author Chia-Pei Chen
Hsieh Fushing
Rob Atwill
Patrice Koehl
title biDCG: a new method for discovering global features of DNA microarray data via an iterative re-clustering procedure.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2014-01-01
description Biclustering techniques have become popular in cancer genetics because they are expected to connect phenotypes to genotypes: they identify subgroups of cancer patients who share similar gene expression patterns, as well as subgroups of genes that are specific to these cancer subtypes and could therefore serve as biomarkers. In this paper we propose a new approach for identifying such relationships, or biclusters, between patients and gene expression profiles. This method, named biDCG, rests on two key concepts. First, it uses a new clustering technique, DCG-tree [Fushing et al., PLoS ONE, 8, e56259 (2013)], that generates ultrametric topological spaces capturing the geometries of both the patient data set and the gene data set. Second, it optimizes the definitions of bicluster membership through an iterative two-way reclustering procedure in which patients and genes are reclustered in turn, based respectively on the subsets of genes and patients defined in the previous round. We have validated biDCG on simulated and real data. On the simulated data, biDCG compares favorably to other biclustering techniques applied to cancer genomics data. On the real data sets, biDCG retrieves relevant biological information.
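The iterative two-way reclustering idea in the description can be illustrated with a rough sketch. This is not the authors' biDCG implementation: plain average-linkage agglomerative clustering stands in for DCG-tree, and each patient (or gene) is summarized by its mean expression over the gene (or patient) groups from the previous round. The names `agglomerate`, `summarize`, and `two_way_recluster`, the fixed cluster counts, and the stopping rule are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def agglomerate(points, k):
    """Average-linkage agglomerative clustering into k groups.

    Assumption: a simple stand-in for DCG-tree, which is not reproduced here.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(pair):
        # Mean pairwise Euclidean distance between two clusters.
        a, b = pair
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)

    # Number clusters by their smallest member so labels are deterministic.
    labels = [0] * len(points)
    for label, members in enumerate(sorted(clusters, key=min)):
        for i in members:
            labels[i] = label
    return labels


def summarize(vectors, labels):
    """Replace each vector by its mean over every labelled group of coordinates."""
    groups = sorted(set(labels))
    return [
        [
            sum(x for x, g in zip(v, labels) if g == grp) / labels.count(grp)
            for grp in groups
        ]
        for v in vectors
    ]


def two_way_recluster(matrix, k_rows, k_cols, max_rounds=10):
    """Alternately recluster rows (patients) and columns (genes).

    Each round reclusters rows using the column groups from the previous
    round, then reclusters columns using the new row groups. Iteration
    stops when neither labelling changes (hypothetical stopping rule).
    """
    rows = [list(r) for r in matrix]
    cols = [list(c) for c in zip(*matrix)]
    row_labels = agglomerate(rows, k_rows)
    col_labels = agglomerate(cols, k_cols)
    for _ in range(max_rounds):
        new_rows = agglomerate(summarize(rows, col_labels), k_rows)
        new_cols = agglomerate(summarize(cols, new_rows), k_cols)
        if new_rows == row_labels and new_cols == col_labels:
            break
        row_labels, col_labels = new_rows, new_cols
    return row_labels, col_labels


# Toy expression matrix (made-up values): 5 patients x 5 genes, two planted blocks.
matrix = [
    [5.0, 5.1, 4.9, 0.1, 0.0],
    [5.2, 4.8, 5.0, 0.0, 0.2],
    [4.9, 5.0, 5.1, 0.1, 0.1],
    [0.0, 0.1, 0.2, 3.0, 3.1],
    [0.1, 0.0, 0.1, 2.9, 3.0],
]
patient_labels, gene_labels = two_way_recluster(matrix, k_rows=2, k_cols=2)
print(patient_labels, gene_labels)
```

A bicluster in this sketch is simply a pair of one patient group and one gene group. The number of groups is fixed in advance here, which is a simplification chosen for this illustration only.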
url http://europepmc.org/articles/PMC4105625?pdf=render