A Strategy of Efficient and Accurate Cardinality Estimation Based on Query Result

Cardinality estimation is an important component of query optimization; its accuracy and efficiency directly determine how effective query optimization is. Traditional cardinality estimation strategies collect statistics from the original table or from a sample and then infer cardinality from those statistics...


Bibliographic Details
Format: Article
Language: Chinese (zho)
Published: The Northwestern Polytechnical University 2018-08-01
Series: Xibei Gongye Daxue Xuebao
Subjects: big data; cardinality estimation; query optimization; query result; efficient; accurate
Online Access: https://www.jnwpu.org/articles/jnwpu/pdf/2018/04/jnwpu2018364p768.pdf
id doaj-7ad6fc9355a549e0bc1d7196bf9ebc3c
record_format Article
spelling doaj-7ad6fc9355a549e0bc1d7196bf9ebc3c; 2021-05-02T19:21:47Z; zho; The Northwestern Polytechnical University; Xibei Gongye Daxue Xuebao; ISSN 1000-2758, 2609-7125; 2018-08-01; vol. 36, no. 4, pp. 768-777; doi:10.1051/jnwpu/20183640768; jnwpu2018364p768; A Strategy of Efficient and Accurate Cardinality Estimation Based on Query Result; School of Computer Science, Northwestern Polytechnical University (all three author entries); https://www.jnwpu.org/articles/jnwpu/pdf/2018/04/jnwpu2018364p768.pdf; big data; cardinality estimation; query optimization; query result; efficient; accurate
collection DOAJ
language zho
format Article
sources DOAJ
title A Strategy of Efficient and Accurate Cardinality Estimation Based on Query Result
publisher The Northwestern Polytechnical University
series Xibei Gongye Daxue Xuebao
issn 1000-2758
2609-7125
publishDate 2018-08-01
description Cardinality estimation is an important component of query optimization; its accuracy and efficiency directly determine how effective query optimization is. Traditional cardinality estimation strategies collect statistics from the original table or from a sample and then infer cardinality from those statistics. This approach has three problems. First, it is inefficient on big data. Second, the statistics lag behind updates and are obtained by inference, so their correctness cannot be guaranteed. Third, some strategies obtain the actual cardinality by executing subqueries but do not keep the results, so fetching statistics remains inefficient. To address these problems, this paper proposes a novel strategy called cardinality estimation based on query result (CEQR). To keep cardinalities correct, CEQR takes statistics directly from query results, a process that does not depend on data size. A cardinality table stores the statistics of base tables and of intermediate results under specific predicates; it serves cardinalities to subsequent queries and is maintained by a set of rules. To make fetching statistics more efficient, a source-aware strategy hashes each cardinality item to an appropriate cache. The paper analyzes the adaptability and deviation of CEQR, and experiments show that CEQR is more efficient than traditional cardinality estimation strategies.
topic big data
cardinality estimation
query optimization
query result
efficient
accurate
url https://www.jnwpu.org/articles/jnwpu/pdf/2018/04/jnwpu2018364p768.pdf
_version_ 1721488430132625408
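
The abstract describes a cardinality table that records cardinalities taken from query results and serves them to later queries, with items hashed to a cache. The sketch below is only an illustration of that idea, not the paper's implementation: the class `CardinalityTable`, the key layout, the choice to hash on the set of source tables, and the invalidation rule are all assumptions made here, since the abstract does not specify them.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List, Optional, Tuple

# Key: the set of source tables plus a whitespace-normalized predicate string.
Key = Tuple[FrozenSet[str], str]


def make_key(tables: Iterable[str], predicate: str = "") -> Key:
    """Build a hashable key; normalization here is deliberately naive."""
    normalized = " ".join(predicate.split())
    return frozenset(t.lower() for t in tables), normalized


@dataclass
class CardinalityTable:
    """Hypothetical store of exact row counts observed from query results."""
    num_caches: int = 4
    caches: List[Dict[Key, int]] = field(init=False)

    def __post_init__(self) -> None:
        self.caches = [{} for _ in range(self.num_caches)]

    def _cache_for(self, key: Key) -> Dict[Key, int]:
        # Assumption: an item is routed to a cache by hashing its source tables.
        return self.caches[hash(key[0]) % self.num_caches]

    def record(self, tables: Iterable[str], predicate: str, row_count: int) -> None:
        """Store the row count of an executed query or intermediate result."""
        key = make_key(tables, predicate)
        self._cache_for(key)[key] = row_count

    def lookup(self, tables: Iterable[str], predicate: str = "") -> Optional[int]:
        """Return a stored cardinality, or None if it was never observed."""
        key = make_key(tables, predicate)
        return self._cache_for(key).get(key)

    def invalidate(self, table: str) -> None:
        # Assumed maintenance rule: drop every entry that involves an updated table.
        name = table.lower()
        for cache in self.caches:
            for key in [k for k in cache if name in k[0]]:
                del cache[key]


ct = CardinalityTable()
ct.record(["orders"], "status = 'open'", 120)
ct.record(["orders", "customers"], "orders.cid = customers.id", 95)
ct.record(["customers"], "", 40)
```

In this sketch a later query with the same tables and predicate reads the stored count instead of estimating it, and `invalidate("orders")` would discard both entries that involve `orders` while keeping the one for `customers`.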