A Strategy of Efficient and Accurate Cardinality Estimation Based on Query Result

Cardinality estimation is an important component of query optimization, and its accuracy and efficiency directly determine the effectiveness of the optimization. Traditional cardinality estimation strategies collect statistics from the original table or a sample and then infer cardinality from those statistics...


Bibliographic Details
Format:	Article
Language:	zho
Published:	The Northwestern Polytechnical University 2018-08-01
Series:	Xibei Gongye Daxue Xuebao
Subjects:	big data; cardinality estimation; query optimization; query result; efficient; accurate
Online Access:	https://www.jnwpu.org/articles/jnwpu/pdf/2018/04/jnwpu2018364p768.pdf

id	doaj-7ad6fc9355a549e0bc1d7196bf9ebc3c
record_format	Article
spelling	doaj-7ad6fc9355a549e0bc1d7196bf9ebc3c; 2021-05-02T19:21:47Z; zho; The Northwestern Polytechnical University; Xibei Gongye Daxue Xuebao; 1000-2758; 2609-7125; 2018-08-01; 36; 4; 768; 777; 10.1051/jnwpu/20183640768; jnwpu2018364p768; A Strategy of Efficient and Accurate Cardinality Estimation Based on Query Result; School of Computer Science, Northwestern Polytechnical University; https://www.jnwpu.org/articles/jnwpu/pdf/2018/04/jnwpu2018364p768.pdf; big data; cardinality estimation; query optimization; query result; efficient; accurate
collection	DOAJ
language	zho
format	Article
sources	DOAJ
title	A Strategy of Efficient and Accurate Cardinality Estimation Based on Query Result
spellingShingle	A Strategy of Efficient and Accurate Cardinality Estimation Based on Query Result Xibei Gongye Daxue Xuebao big data cardinality estimation query optimization query result efficient accurate
title_short	A Strategy of Efficient and Accurate Cardinality Estimation Based on Query Result
title_full	A Strategy of Efficient and Accurate Cardinality Estimation Based on Query Result
title_fullStr	A Strategy of Efficient and Accurate Cardinality Estimation Based on Query Result
title_full_unstemmed	A Strategy of Efficient and Accurate Cardinality Estimation Based on Query Result
title_sort	strategy of efficient and accurate cardinality estimation based on query result
publisher	The Northwestern Polytechnical University
series	Xibei Gongye Daxue Xuebao
issn	1000-2758 2609-7125
publishDate	2018-08-01
description	Cardinality estimation is an important component of query optimization, and its accuracy and efficiency directly determine the effectiveness of the optimization. Traditional cardinality estimation strategies collect statistics from the original table or a sample and then infer cardinality from those statistics. This approach has three drawbacks: it is inefficient on big data; the statistics lag behind updates and are obtained by inference, so their correctness cannot be guaranteed; and although some strategies obtain the actual cardinality by executing subqueries, they do not keep the results, which makes fetching statistics inefficient. To address these problems, this paper proposes a novel strategy called cardinality estimation based on query result (CEQR). To keep cardinalities correct, CEQR obtains statistics directly from query results, which is independent of data size. A cardinality table stores the statistics of base tables and intermediate results under specific predicates and provides cardinalities to subsequent queries, and a set of rules maintains the table. To make fetching statistics more efficient, a source-aware strategy hashes each cardinality item to an appropriate cache. The paper analyzes the adaptability and deviation of CEQR and shows experimentally that it is more efficient than traditional cardinality estimation strategies.
topic	big data; cardinality estimation; query optimization; query result; efficient; accurate
url	https://www.jnwpu.org/articles/jnwpu/pdf/2018/04/jnwpu2018364p768.pdf
_version_	1721488430132625408
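
The cardinality table described in the abstract can be pictured with a minimal Python sketch. This is an illustration only, not the authors' implementation: the names `CardinalityTable`, `make_key`, `record`, `lookup`, and `invalidate` are hypothetical, the key normalization is an assumption, and the single maintenance rule shown (drop every entry that involves a modified table) is a stand-in for the paper's rule set, which the record does not spell out. The source-aware cache hashing is not modeled.

```python
from dataclasses import dataclass, field


def make_key(tables, predicates):
    """Normalize (tables, predicates) so equivalent subqueries share one key.

    Assumption: sorting is enough to make two equivalent requests match.
    """
    return (tuple(sorted(tables)), tuple(sorted(predicates)))


@dataclass
class CardinalityTable:
    """Hypothetical store of exact row counts observed in query results."""

    entries: dict = field(default_factory=dict)

    def record(self, tables, predicates, row_count):
        # Keep the actual row count of a base table or intermediate result.
        self.entries[make_key(tables, predicates)] = row_count

    def lookup(self, tables, predicates):
        # Return the stored cardinality, or None so the caller can fall back
        # to a conventional statistics-based estimator.
        return self.entries.get(make_key(tables, predicates))

    def invalidate(self, table):
        # Assumed maintenance rule: remove every entry involving a modified
        # table and report how many entries were dropped.
        stale = [key for key in self.entries if table in key[0]]
        for key in stale:
            del self.entries[key]
        return len(stale)


ct = CardinalityTable()
ct.record(["orders"], ["status = 'open'"], 1200)
ct.record(
    ["orders", "customers"],
    ["orders.cid = customers.id", "status = 'open'"],
    950,
)
```

A later query that touches the same tables under the same predicates, in any order, would reuse the stored count instead of re-estimating it; a miss returns `None`, leaving room for a traditional estimator as a fallback.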


Similar Items