Scalable Algorithms for Mining Maximal Quasi Frequent Subnetworks

Frequent graph mining has received considerable attention from researchers. Existing algorithms for frequent subgraph mining do not scale to large networks and can take hours to finish. Mining multiple gene coexpression networks makes it possible to identify context-specific modules. Frequent subnetworks re...

Bibliographic Details
Main Author: El Radie, Eihab Salah
Format: Others
Published: North Dakota State University 2018
Online Access: https://hdl.handle.net/10365/28735
id ndltd-ndsu.edu-oai-library.ndsu.edu-10365-28735
record_format oai_dc
spelling ndltd-ndsu.edu-oai-library.ndsu.edu-10365-28735 2021-09-28T17:11:37Z Scalable Algorithms for Mining Maximal Quasi Frequent Subnetworks El Radie, Eihab Salah 2018-07-30T19:12:06Z 2018-07-30T19:12:06Z 2018 text/thesis https://hdl.handle.net/10365/28735 NDSU Policy 190.6.2 https://www.ndsu.edu/fileadmin/policy/190.pdf application/pdf North Dakota State University
collection NDLTD
format Others
sources NDLTD
description Frequent graph mining has received considerable attention from researchers. Existing algorithms for frequent subgraph mining do not scale to large networks and can take hours to finish. Mining multiple gene coexpression networks makes it possible to identify context-specific modules, and frequent subnetworks represent essential biological modules. In this thesis, we propose two algorithms for mining frequent subgraphs. The first is a parallel algorithm for mining maximal frequent subgraphs from gene coexpression networks. Despite its parallelization, this algorithm is slow and does not allow relaxation. These limitations motivated a second algorithm, a greedy approach for mining approximate frequent subgraphs, that addresses both problems. Experiments on real tissue-specific RNA-seq expression networks and on synthetic data demonstrate the effectiveness of the proposed algorithms. Moreover, biological enrichment analysis shows that the reported patterns are biologically relevant and enriched with known biological processes and KEGG pathways.
author El Radie, Eihab Salah
title Scalable Algorithms for Mining Maximal Quasi Frequent Subnetworks
publisher North Dakota State University
publishDate 2018
url https://hdl.handle.net/10365/28735
work_keys_str_mv AT elradieeihabsalah scalablealgorithmsforminingmaximalquasifrequentsubnetworks
_version_ 1719485715522256896