Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis

Molecular biology has entered an era of systematic and automated experimentation. High-throughput techniques have moved biology from small-scale experiments focused on specific genes and proteins to genome- and proteome-wide screens. One result of this endeavor is the compilation of complex networks...

Bibliographic Details
Main Author: Royer, Loic
Other Authors: Technische Universität Dresden, Fakultät Informatik
Format: Doctoral Thesis
Language: English
Published: Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden 2017
Subjects:
Y2H
Online Access: http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-62562
http://www.qucosa.de/fileadmin/data/qucosa/documents/6256/PhdThesis.pdf
id ndltd-DRESDEN-oai-qucosa.de-bsz-14-qucosa-62562
record_format oai_dc
spelling ndltd-DRESDEN-oai-qucosa.de-bsz-14-qucosa-62562 2017-12-13T03:28:27Z Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis Royer, Loic protein interaction networks Power graph analysis proteomics bioinformatics computational biology graph theory visualization network compression Y2H APMS miR-124 HIF-1 MELAS Sjogren syndrome ddc:570 rvk:WD 5100 Molecular biology has entered an era of systematic and automated experimentation. High-throughput techniques have moved biology from small-scale experiments focused on specific genes and proteins to genome- and proteome-wide screens. One result of this endeavor is the compilation of complex networks of interacting proteins. Molecular biologists hope to understand life's complex molecular machines by studying these networks. This thesis addresses three open problems centered upon their analysis and quality assessment. First, we introduce power graph analysis as a novel approach to the representation and visualization of biological networks. Power graphs are a graph-theoretic approach to the lossless and compact representation of complex networks. They group edges into cliques and bicliques, and nodes into a neighborhood hierarchy. We demonstrate power graph analysis on five examples, and show its advantages over traditional network representations. Moreover, we evaluate the algorithm's performance on a benchmark, test the robustness of the algorithm to noise, and measure its empirical time complexity at O(e^1.71), sub-quadratic in the number of edges e. Second, we tackle the difficult and controversial problem of data quality in protein interaction networks. We propose a novel measure for accuracy and completeness of genome-wide protein interaction networks based on network compressibility.
We validate this new measure by i) verifying the detrimental effect of false positives and false negatives, ii) showing that gold standard networks are highly compressible, iii) showing that authors' choice of confidence thresholds is consistent with high network compressibility, iv) presenting evidence that compressibility is correlated with co-expression, co-localization and shared function, v) showing that complete and accurate networks of complex systems in other domains exhibit levels of compressibility similar to those of current high-quality interactomes. Third, we apply power graph analysis to networks derived from text-mining as well as to gene expression microarray data. In particular, we present i) the network-based analysis of genome-wide expression profiles of the neuroectodermal conversion of mesenchymal stem cells, ii) the analysis of regulatory modules in a rare mitochondrial cytopathy: Mitochondrial Encephalomyopathy, Lactic acidosis, and Stroke-like episodes (MELAS), and iii) an investigation of the biochemical causes behind the enhanced biocompatibility of tantalum compared with titanium. Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden Technische Universität Dresden, Fakultät Informatik Prof. Dr. Michael Schroeder Prof. Dr. Ralf Zimmer 2017-12-12 doc-type:doctoralThesis application/pdf http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-62562 urn:nbn:de:bsz:14-qucosa-62562 http://www.qucosa.de/fileadmin/data/qucosa/documents/6256/PhdThesis.pdf eng
collection NDLTD
language English
format Doctoral Thesis
sources NDLTD
topic protein interaction networks
Power graph analysis
proteomics
bioinformatics
computational biology
graph theory
visualization
network compression
Y2H
APMS
miR-124
HIF-1
MELAS
Sjogren syndrome
ddc:570
rvk:WD 5100
spellingShingle protein interaction networks
Power graph analysis
proteomics
bioinformatics
computational biology
graph theory
visualization
network compression
Y2H
APMS
miR-124
HIF-1
MELAS
Sjogren syndrome
ddc:570
rvk:WD 5100
Royer, Loic
Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis
description Molecular biology has entered an era of systematic and automated experimentation. High-throughput techniques have moved biology from small-scale experiments focused on specific genes and proteins to genome- and proteome-wide screens. One result of this endeavor is the compilation of complex networks of interacting proteins. Molecular biologists hope to understand life's complex molecular machines by studying these networks. This thesis addresses three open problems centered upon their analysis and quality assessment. First, we introduce power graph analysis as a novel approach to the representation and visualization of biological networks. Power graphs are a graph-theoretic approach to the lossless and compact representation of complex networks. They group edges into cliques and bicliques, and nodes into a neighborhood hierarchy. We demonstrate power graph analysis on five examples, and show its advantages over traditional network representations. Moreover, we evaluate the algorithm's performance on a benchmark, test the robustness of the algorithm to noise, and measure its empirical time complexity at O(e^1.71), sub-quadratic in the number of edges e. Second, we tackle the difficult and controversial problem of data quality in protein interaction networks. We propose a novel measure for accuracy and completeness of genome-wide protein interaction networks based on network compressibility. We validate this new measure by i) verifying the detrimental effect of false positives and false negatives, ii) showing that gold standard networks are highly compressible, iii) showing that authors' choice of confidence thresholds is consistent with high network compressibility, iv) presenting evidence that compressibility is correlated with co-expression, co-localization and shared function, v) showing that complete and accurate networks of complex systems in other domains exhibit levels of compressibility similar to those of current high-quality interactomes.
Third, we apply power graph analysis to networks derived from text-mining as well as to gene expression microarray data. In particular, we present i) the network-based analysis of genome-wide expression profiles of the neuroectodermal conversion of mesenchymal stem cells, ii) the analysis of regulatory modules in a rare mitochondrial cytopathy: Mitochondrial Encephalomyopathy, Lactic acidosis, and Stroke-like episodes (MELAS), and iii) an investigation of the biochemical causes behind the enhanced biocompatibility of tantalum compared with titanium.
author2 Technische Universität Dresden, Fakultät Informatik
author_facet Technische Universität Dresden, Fakultät Informatik
Royer, Loic
author Royer, Loic
author_sort Royer, Loic
title Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis
title_short Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis
title_full Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis
title_fullStr Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis
title_full_unstemmed Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis
title_sort unraveling the structure and assessing the quality of protein interaction networks with power graph analysis
publisher Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden
publishDate 2017
url http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-62562
http://www.qucosa.de/fileadmin/data/qucosa/documents/6256/PhdThesis.pdf
work_keys_str_mv AT royerloic unravelingthestructureandassessingthequalityofproteininteractionnetworkswithpowergraphanalysis
_version_ 1718563768057724928