GRAMI: Generalized Frequent Subgraph Mining in Large Graphs

Bibliographic Details
Main Author: El Saeedy, Mohammed El Sayed
Other Authors: Kalnis, Panos
Language: en
Published: 2012
Online Access: El Saeedy, M. E. S. (2011). GRAMI: Generalized Frequent Subgraph Mining in Large Graphs. KAUST Research Repository. https://doi.org/10.25781/KAUST-G70TD
http://hdl.handle.net/10754/209372
id ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-209372
record_format oai_dc
spelling ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-209372
2021-09-15T05:06:42Z
GRAMI: Generalized Frequent Subgraph Mining in Large Graphs
El Saeedy, Mohammed El Sayed
Kalnis, Panos
Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division
Gao, Xin
Ravasi, Timothy
2012-02-04T08:11:42Z
2012-02-04T08:11:42Z
2011-07-24
Thesis
El Saeedy, M. E. S. (2011). GRAMI: Generalized Frequent Subgraph Mining in Large Graphs. KAUST Research Repository. https://doi.org/10.25781/KAUST-G70TD
10.25781/KAUST-G70TD
http://hdl.handle.net/10754/209372
en
collection NDLTD
language en
sources NDLTD
description Mining frequent subgraphs is an important operation on graphs. Most existing work assumes a database of many small graphs, but modern applications, such as social networks, citation graphs or protein-protein interaction in bioinformatics, are modeled as a single large graph. Interesting interactions in such applications may be transitive (e.g., friend of a friend). Existing methods, however, search for frequent isomorphic (i.e., exact match) subgraphs and cannot discover many useful patterns. In this paper we propose GRAMI, a framework that generalizes frequent subgraph mining in a large single graph. GRAMI discovers frequent patterns. A pattern is a graph where edges are generalized to distance-constrained paths. Depending on the definition of the distance function, many instantiations of the framework are possible. Both directed and undirected graphs, as well as multiple labels per vertex, are supported. We developed an efficient implementation of the framework that models the frequency resolution phase as a constraint satisfaction problem, in order to avoid the costly enumeration of all instances of each pattern in the graph. We also implemented CGRAMI, a version that supports structural and semantic constraints; and AGRAMI, an approximate version that supports very large graphs. Our experiments on real data demonstrate that our framework is up to 3 orders of magnitude faster and discovers more interesting patterns than existing approaches.
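The description above can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes an undirected graph, one label per vertex, hop count as the distance function, and a support count equal to the smallest number of distinct graph vertices that any pattern vertex can be mapped to (the record does not state which support measure is used). All function and variable names are hypothetical. Each pattern vertex is a constraint-satisfaction variable whose domain is the set of graph vertices carrying the same label; the search stops at the first solution for each candidate, so the instances of the pattern are never enumerated in full.

```python
from collections import deque

def hop_distances(adj, source, max_hops):
    """BFS from source; returns {vertex: hops} for vertices within max_hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand beyond the distance bound
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def consistent(var, cand, assignment, p_edges, reach):
    """Check every pattern edge between var and an already assigned vertex."""
    for a, b in p_edges:
        other = b if a == var else a if b == var else None
        if other is not None and other in assignment:
            if assignment[other] not in reach[cand]:
                return False
    return True

def has_solution(assignment, variables, domains, p_edges, reach):
    """Backtracking search that returns True at the first full assignment."""
    if len(assignment) == len(variables):
        return True
    var = next(x for x in variables if x not in assignment)
    used = set(assignment.values())
    for cand in domains[var]:
        # Distinct pattern vertices must map to distinct graph vertices.
        if cand in used or not consistent(var, cand, assignment, p_edges, reach):
            continue
        assignment[var] = cand
        found = has_solution(assignment, variables, domains, p_edges, reach)
        del assignment[var]
        if found:
            return True
    return False

def support(adj, labels, p_labels, p_edges, delta):
    """Smallest number of valid graph vertices over all pattern vertices."""
    # reach[v]: vertices within delta hops of v (a generalized pattern edge).
    reach = {v: set(hop_distances(adj, v, delta)) for v in adj}
    variables = list(p_labels)
    domains = {x: [v for v in adj if labels[v] == p_labels[x]] for x in variables}
    counts = []
    for x in variables:
        valid = sum(
            1 for v in domains[x]
            if has_solution({x: v}, variables, domains, p_edges, reach)
        )
        counts.append(valid)
    return min(counts)

# Toy graph: a1 - x - b1, a2 - b2, and an isolated vertex a3.
adj = {"a1": ["x"], "x": ["a1", "b1"], "b1": ["x"],
       "a2": ["b2"], "b2": ["a2"], "a3": []}
labels = {"a1": "A", "x": "X", "b1": "B", "a2": "A", "b2": "B", "a3": "A"}

# Pattern: an A vertex connected to a B vertex by a path of at most delta hops.
p_labels = {"u": "A", "v": "B"}
p_edges = [("u", "v")]

print(support(adj, labels, p_labels, p_edges, delta=2))
print(support(adj, labels, p_labels, p_edges, delta=1))
```

With delta=1 the pattern reduces to an exact A-B edge and only the pair (a2, b2) matches; with delta=2 the pair (a1, b1), joined through x, is matched as well, which is the kind of transitive pattern that exact matching misses.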
author2 Kalnis, Panos
author_facet Kalnis, Panos
El Saeedy, Mohammed El Sayed
author El Saeedy, Mohammed El Sayed
author_sort El Saeedy, Mohammed El Sayed
title GRAMI: Generalized Frequent Subgraph Mining in Large Graphs
publishDate 2012
url El Saeedy, M. E. S. (2011). GRAMI: Generalized Frequent Subgraph Mining in Large Graphs. KAUST Research Repository. https://doi.org/10.25781/KAUST-G70TD
http://hdl.handle.net/10754/209372