Why-Query Support in Graph Databases

In the last few decades, database management systems have become powerful tools for storing large amounts of data and executing complex queries over them. Alongside this extended functionality, novel types of databases have appeared, such as triple stores and distributed databases. Graph databases implementing the...

Bibliographic Details
Main Author:	Vasilyeva, Elena
Other Authors:	Technische Universität Dresden, Fakultät Informatik
Format:	Doctoral Thesis
Language:	English
Published:	Saechsische Landesbibliothek - Staats- und Universitaetsbibliothek Dresden, 2017
Subjects:	Graph Datenbanken; Anfragebearbeitung; Graph databases; pattern matching; empty-answer problem; why-queries; ddc:004; rvk:ST 265; rvk:ST 270
Online Access:	http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-221730 http://www.qucosa.de/fileadmin/data/qucosa/documents/22173/thesis.pdf

id	ndltd-DRESDEN-oai-qucosa.de-bsz-14-qucosa-221730
record_format	oai_dc
collection	NDLTD
language	English
format	Doctoral Thesis
sources	NDLTD
topic	Graph Datenbanken; Anfragebearbeitung; Graph databases; pattern matching; empty-answer problem; why-queries; ddc:004; rvk:ST 265; rvk:ST 270
description	In the last few decades, database management systems have become powerful tools for storing large amounts of data and executing complex queries over them. Alongside this extended functionality, novel types of databases have appeared, such as triple stores and distributed databases. Graph databases implementing the property-graph model belong to this branch of development and provide a new way of storing and processing data in the form of a graph, with nodes representing entities and edges describing the connections between them. This makes them suitable for keeping data without a rigid schema in use cases such as social-network processing or data integration. In addition to flexible storage, graph databases provide new querying possibilities in the form of path queries, detection of connected components, pattern matching, etc. However, schema flexibility and graph queries come at an additional cost. With limited knowledge of the data and little experience in constructing complex queries, users can write queries that deliver unexpected results. Forced to debug queries manually and overwhelmed by the number of query constraints, users can become frustrated with graph databases. What is needed is to improve the usability of graph databases by providing debugging and explanation functionality for such situations. We have to help users discover the reasons for unexpected results and what can be done to fix them. The unexpectedness of a result set can be expressed in terms of its size or its content. In the first case, users have to solve the empty-answer, too-many-answers, or too-few-answers problem. In the second case, users care about the result content and either miss some expected answers or wonder about the presence of unexpected ones.
Considering the typical problems of receiving no or too many results when querying graph databases, this thesis focuses on the problems of the first group, whose solutions are usually expressed as why-empty, why-so-few, and why-so-many queries. Our objective is to extend graph databases with debugging functionality in the form of why-queries for unexpected query results, using pattern-matching queries, one of the general graph-query types, as an example. We present a comprehensive analysis of existing debugging tools in state-of-the-art research and identify their common properties. From these, we derive the following features of why-queries, which we discuss in this thesis: holistic support of different cardinality-based problems, explanation of unexpected results and query reformulation, comprehensive analysis of explanations, and non-intrusive user integration. To support different cardinality-based problems, we develop methods for explaining no, too few, and too many results. To cover different kinds of explanations, we present two types: subgraph-based and modification-based explanations. The first type identifies the reasons for unexpectedness in terms of query subgraphs and delivers differential graphs as answers. The second reformulates queries so that they produce better results. Treating graph queries as complex structures with multiple constraints, we investigate different ways of generating explanations, starting from the most general one, which considers only the query topology, through coarse-grained rewriting, up to fine-grained modification, which allows fine changes to predicates and topology. To provide a comprehensive analysis of explanations, we propose comparing them on three levels: the syntactic description, the content, and the size of the result set. To deliver user-aware explanations, we discuss two models for non-intrusive user integration in the generation process.
With the techniques proposed in this thesis, we provide fundamentals for debugging pattern-matching queries that deliver no, too few, or too many results in graph databases implementing the property-graph model.
author2	Technische Universität Dresden, Fakultät Informatik
author	Vasilyeva, Elena
title	Why-Query Support in Graph Databases
publisher	Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden
publishDate	2017
url	http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-221730 http://www.qucosa.de/fileadmin/data/qucosa/documents/22173/thesis.pdf
spelling	ndltd-DRESDEN-oai-qucosa.de-bsz-14-qucosa-221730 2017-03-29T03:37:35Z Why-Query Support in Graph Databases Vasilyeva, Elena Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden Technische Universität Dresden, Fakultät Informatik Prof. Dr.-Ing. Wolfgang Lehner Associate Prof. Dr.-Ing. Katja Hose 2017-03-28 doc-type:doctoralThesis application/pdf http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-221730 urn:nbn:de:bsz:14-qucosa-221730 http://www.qucosa.de/fileadmin/data/qucosa/documents/22173/thesis.pdf eng

Similar Items