Comparative study of unsupervised dimension reduction techniques for the visualization of microarray gene expression data
Abstract. Background: Visualization of DNA microarray data in two- or three-dimensional space is an important exploratory analysis step for detecting quality issues or generating new hypotheses. Principal Component Analysis (PCA) is a widely used li...
Main Authors: | Jiang Xiaoyi, Ruckert Christian, Klein Hans-Ulrich, Bartenhagen Christoph, Dugas Martin |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2010-11-01 |
Series: | BMC Bioinformatics |
Online Access: | http://www.biomedcentral.com/1471-2105/11/567 |
id |
doaj-f78f9a37e5cb4b05b96b8c30375eb222 |
---|---|
record_format |
Article |
spelling |
doaj-f78f9a37e5cb4b05b96b8c30375eb222; 2020-11-24T23:29:57Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2010-11-01; 11; 1; 567; 10.1186/1471-2105-11-567; Comparative study of unsupervised dimension reduction techniques for the visualization of microarray gene expression data; Jiang Xiaoyi; Ruckert Christian; Klein Hans-Ulrich; Bartenhagen Christoph; Dugas Martin; http://www.biomedcentral.com/1471-2105/11/567 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Jiang Xiaoyi Ruckert Christian Klein Hans-Ulrich Bartenhagen Christoph Dugas Martin |
author_sort |
Jiang Xiaoyi |
title |
Comparative study of unsupervised dimension reduction techniques for the visualization of microarray gene expression data |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2010-11-01 |
description |
Abstract. Background: Visualization of DNA microarray data in two- or three-dimensional space is an important exploratory analysis step for detecting quality issues or generating new hypotheses. Principal Component Analysis (PCA) is a widely used linear method for defining the mapping between high-dimensional data and its low-dimensional representation. During the last decade, many new nonlinear methods for dimension reduction have been proposed, but it is still unclear how well these methods capture the underlying structure of microarray gene expression data. In this study, we assessed the performance of PCA and of six nonlinear dimension reduction methods, namely Kernel PCA, Locally Linear Embedding, Isomap, Diffusion Maps, Laplacian Eigenmaps and Maximum Variance Unfolding, for the visualization of microarray data. Results: A systematic benchmark, consisting of Support Vector Machine classification, cluster validation and noise evaluations, was applied to ten microarray datasets and several simulated datasets. Significant differences between PCA and most of the nonlinear methods were observed in two- and three-dimensional target spaces. With an increasing number of dimensions and an increasing number of differentially expressed genes, all methods showed similar performance. PCA and Diffusion Maps were less sensitive to noise than the other nonlinear methods. Conclusions: Locally Linear Embedding and Isomap showed superior performance on all datasets. In very low-dimensional representations and with few differentially expressed genes, these two methods preserve more of the underlying structure of the data than PCA, and are thus favorable alternatives for the visualization of microarray data. |
url |
http://www.biomedcentral.com/1471-2105/11/567 |