Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods
Projections are conventional dimensionality reduction methods for information visualization that transform high-dimensional data into a low-dimensional space. If the projection method restricts the output space to two dimensions, the result is a scatter plot. The goal of this scatter plot is to...
Main Authors: | Michael C. Thrun, Alfred Ultsch |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2020-01-01 |
Series: | MethodsX |
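Subjects: | Dimensionality reduction; Projection methods; Data visualization; Unsupervised neural networks; Self-organizing maps |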
Online Access: | http://www.sciencedirect.com/science/article/pii/S2215016120303137 |
id |
doaj-03c37bea42264da78d71c92facd28767 |
---|---|
record_format |
Article |
spelling |
doaj-03c37bea42264da78d71c92facd28767 2021-01-02T05:11:03Z eng Elsevier MethodsX 2215-0161 2020-01-01 7 101093 Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods Michael C. Thrun, PhD 0 Alfred Ultsch, Prof. Dr. habil. 1 Dept. of Hematology, Oncology and Immunology, Philipps-University of Marburg, Baldingerstraße, D-35043 Marburg; Corresponding author. Databionics Research Group, Philipps-University of Marburg, Hans-Meerwein-Straße 6, Marburg D-35032, Germany Projections are conventional methods of dimensionality reduction for information visualization used to transform high-dimensional data into low dimensional space. If the projection method restricts the output space to two dimensions, the result is a scatter plot. The goal of this scatter plot is to visualize the relative relationships between high-dimensional data points that build up distance and density-based structures. However, the Johnson–Lindenstrauss lemma states that the two-dimensional similarities in the scatter plot cannot coercively represent high-dimensional structures. Here, a simplified emergent self-organizing map uses the projected points of such a scatter plot in combination with the dataset in order to compute the generalized U-matrix. The generalized U-matrix defines the visualization of a topographic map depicting the misrepresentations of projected points with regards to a given dimensionality reduction method and the dataset. • The topographic map provides accurate information about the high-dimensional distance and density based structures of high-dimensional data if an appropriate dimensionality reduction method is selected. • The topographic map can uncover the absence of distance-based structures. • The topographic map reveals the number of clusters in a dataset as the number of valleys. http://www.sciencedirect.com/science/article/pii/S2215016120303137 Dimensionality reduction Projection methods Data visualization Unsupervised neural networks Self-organizing maps |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Michael C. Thrun, PhD Alfred Ultsch, Prof. Dr. habil. |
spellingShingle |
Michael C. Thrun, PhD Alfred Ultsch, Prof. Dr. habil. Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods MethodsX Dimensionality reduction Projection methods Data visualization Unsupervised neural networks Self-organizing maps |
author_facet |
Michael C. Thrun, PhD Alfred Ultsch, Prof. Dr. habil. |
author_sort |
Michael C. Thrun, PhD |
title |
Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods |
title_short |
Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods |
title_full |
Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods |
title_fullStr |
Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods |
title_full_unstemmed |
Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods |
title_sort |
uncovering high-dimensional structures of projections from dimensionality reduction methods |
publisher |
Elsevier |
series |
MethodsX |
issn |
2215-0161 |
publishDate |
2020-01-01 |
description |
Projections are conventional dimensionality reduction methods for information visualization that transform high-dimensional data into a low-dimensional space. If the projection method restricts the output space to two dimensions, the result is a scatter plot. The goal of this scatter plot is to visualize the relative relationships between high-dimensional data points that build up distance- and density-based structures. However, the Johnson–Lindenstrauss lemma implies that the two-dimensional similarities in the scatter plot cannot necessarily represent high-dimensional structures. Here, a simplified emergent self-organizing map uses the projected points of such a scatter plot in combination with the dataset to compute the generalized U-matrix. The generalized U-matrix defines the visualization of a topographic map depicting the misrepresentations of projected points with regard to a given dimensionality reduction method and the dataset. • The topographic map provides accurate information about the high-dimensional distance- and density-based structures of the data if an appropriate dimensionality reduction method is selected. • The topographic map can uncover the absence of distance-based structures. • The topographic map reveals the number of clusters in a dataset as the number of valleys. |
topic |
Dimensionality reduction Projection methods Data visualization Unsupervised neural networks Self-organizing maps |
url |
http://www.sciencedirect.com/science/article/pii/S2215016120303137 |
work_keys_str_mv |
AT michaelcthrunphd uncoveringhighdimensionalstructuresofprojectionsfromdimensionalityreductionmethods AT alfredultschprofdrhabil uncoveringhighdimensionalstructuresofprojectionsfromdimensionalityreductionmethods |
_version_ |
1724359370669555712 |
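
Illustrative note on the description field: the abstract says the generalized U-matrix combines the projected points with the original dataset to expose where a scatter plot misrepresents high-dimensional structure. The Python sketch below is not the authors' simplified emergent self-organizing map and does not compute a generalized U-matrix. It is a much simpler stand-in that only illustrates the underlying idea: for each point, average the high-dimensional distances to its nearest neighbours in the two-dimensional projection. The function name `neighborhood_heights`, the parameter `k`, and the toy arrays are all assumptions made for this example.

```python
import numpy as np


def neighborhood_heights(data, projected, k=5):
    """Mean high-dimensional distance from each point to its k nearest
    neighbours in the 2-D projection (hypothetical helper, not the
    generalized U-matrix from the article)."""
    data = np.asarray(data, dtype=float)
    projected = np.asarray(projected, dtype=float)
    n = data.shape[0]
    if projected.shape[0] != n:
        raise ValueError("data and projected must have the same number of rows")
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    # Pairwise Euclidean distances in the projection and in the input space.
    d_low = np.linalg.norm(projected[:, None, :] - projected[None, :, :], axis=-1)
    d_high = np.linalg.norm(data[:, None, :] - data[None, :, :], axis=-1)

    # A point must not count as its own neighbour.
    np.fill_diagonal(d_low, np.inf)

    # Indices of the k nearest neighbours in the 2-D scatter plot.
    nearest = np.argsort(d_low, axis=1)[:, :k]

    # Average the high-dimensional distances to those neighbours.
    return np.take_along_axis(d_high, nearest, axis=1).mean(axis=1)


# Toy data: two tight pairs of points that lie far apart in three dimensions.
demo_data = np.array([[0, 0, 0], [0, 0, 1], [10, 10, 10], [10, 10, 11]])
# A projection that keeps each pair together.
demo_faithful = np.array([[0, 0], [0, 1], [5, 5], [5, 6]])
# A projection that places points from different pairs next to each other.
demo_misleading = np.array([[0, 0], [5, 5], [0, 1], [5, 6]])

print(neighborhood_heights(demo_data, demo_faithful, k=1))
print(neighborhood_heights(demo_data, demo_misleading, k=1))
```

Low values mean that neighbours in the scatter plot are also close in the original space; high values flag regions where the plot places distant points side by side. In the article's terms, such high regions would correspond loosely to the mountains of the topographic map and low regions to its valleys, but that correspondence is only an analogy here.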