Statistical aspects of persistent homology

This thesis investigates statistical approaches to interpreting the output of persistent homology, a multi-resolution algorithm for discovering topological structure in data. We provide a brief introduction to the theory of topology and homology. The output is a set of intervals, visualised either a...

Bibliographic Details
Main Author:	Arnold, Matthew George
Published:	University of Bristol 2015
Subjects:	514
Online Access:	http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.682184

id	ndltd-bl.uk-oai-ethos.bl.uk-682184
record_format	oai_dc
spelling	ndltd-bl.uk-oai-ethos.bl.uk-682184; 2017-03-16T16:23:30Z; Statistical aspects of persistent homology; Arnold, Matthew George; 2015; 514; University of Bristol; http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.682184; Electronic Thesis or Dissertation
collection	NDLTD
sources	NDLTD
topic	514
spellingShingle	514 Arnold, Matthew George Statistical aspects of persistent homology
description	This thesis investigates statistical approaches to interpreting the output of persistent homology, a multi-resolution algorithm for discovering topological structure in data. We provide a brief introduction to the theory of topology and homology. The output is a set of intervals, visualised either as a 'barcode' or as a set of points called a persistence diagram. We discuss suitable metrics for persistence diagrams. The following chapter demonstrates how to compute persistent homology using R. Following this foundational work, we find a confidence set for the true persistence diagram of the underlying space using a sample diagram. Such sets aid with the interpretation of persistence diagrams by identifying points that are likely representative of true topological features, and those points that are noise due to sampling. We present two methods of constructing confidence sets. The first assumes that the support of the sampling density is not too 'spiky'. The second method uses a stronger assumption that the data are a realisation of a homogeneous Poisson process, which leads to a less conservative confidence set. In the middle section of this thesis, we investigate further sampling properties of persistence diagrams. Sampling on the circle leads us to propose a barcode test of sampling uniformity. We look at the diagrams of samples from the unit square, which is topologically simple, and propose these as a model for the noise in diagrams from other spaces. We propose density corrected persistent homology that makes sample diagrams less sensitive to the geometry of the underlying space and the sampling density. In the last section of this thesis, we demonstrate how persistent homology can be used to identify topological structure in correlation and partial correlation matrices. This relates to the problem of structure learning in graphical models.
author	Arnold, Matthew George
author_facet	Arnold, Matthew George
author_sort	Arnold, Matthew George
title	Statistical aspects of persistent homology
title_short	Statistical aspects of persistent homology
title_full	Statistical aspects of persistent homology
title_fullStr	Statistical aspects of persistent homology
title_full_unstemmed	Statistical aspects of persistent homology
title_sort	statistical aspects of persistent homology
publisher	University of Bristol
publishDate	2015
url	http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.682184
work_keys_str_mv	AT arnoldmatthewgeorge statisticalaspectsofpersistenthomology
_version_	1718422999356407808

Similar Items