Statistical Analysis of Protein Ensembles

As 3D protein-configuration data accumulates, there is a growing need for well-defined, mathematically rigorous analysis approaches, especially since the vast majority of currently available methods rely heavily on heuristics. We propose an analysis framework that stems from topology,...


Bibliographic Details
Main Authors: Gabriell Máté, Dieter W Heermann
Format: Article
Language: English
Published: Frontiers Media S.A. 2014-04-01
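Series: Frontiers in Physics
Subjects: topology; topological features; topological similarity; Wasserstein distance; statistical comparison
Online Access: http://journal.frontiersin.org/Journal/10.3389/fphy.2014.00020/full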
id doaj-b38ed762625040eabddb9182cedbd37d
record_format Article
spelling doaj-b38ed762625040eabddb9182cedbd37d | 2020-11-24T23:03:43Z | eng | Frontiers Media S.A. | Frontiers in Physics | 2296-424X | 2014-04-01 | 2 | 10.3389/fphy.2014.00020 | 87770 | Statistical Analysis of Protein Ensembles | Gabriell Máté (Heidelberg University) | Dieter W Heermann (Heidelberg University) | http://journal.frontiersin.org/Journal/10.3389/fphy.2014.00020/full | topology; topological features; topological similarity; Wasserstein distance; statistical comparison
collection DOAJ
language English
format Article
sources DOAJ
author Gabriell Máté
Dieter W Heermann
title Statistical Analysis of Protein Ensembles
publisher Frontiers Media S.A.
series Frontiers in Physics
issn 2296-424X
publishDate 2014-04-01
description As 3D protein-configuration data accumulates, there is a growing need for well-defined, mathematically rigorous analysis approaches, especially since the vast majority of currently available methods rely heavily on heuristics. We propose an analysis framework that stems from topology, the field of mathematics that studies properties preserved under continuous deformations. First, we calculate a barcode representation of the molecules using computational topology algorithms. Bars in this barcode represent different topological features. Molecules are compared through their barcodes by statistically determining the difference between their sets of topological features. As a proof-of-principle application, we analyze a dataset compiled from ensembles of different proteins, obtained from the Ensemble Protein Database. We demonstrate that our approach correctly detects the different protein groupings.
topic topology
topological features
topological similarity
Wasserstein distance
statistical comparison
url http://journal.frontiersin.org/Journal/10.3389/fphy.2014.00020/full