@inproceedings{DBLP:conf/icde/YanL94,
author = {Weipeng P. Yan and
Per-{\AA}ke Larson},
title = {Performing Group-By before Join},
booktitle = {Proceedings of the Tenth International Conference on Data Engineering,
February 14-18, 1994, Houston, Texas, USA},
publisher = {IEEE Computer Society},
year = {1994},
isbn = {0-8186-5400-7},
pages = {89--100},
ee = {db/conf/icde/YanL94.html},
crossref = {DBLP:conf/icde/94},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Assume that we have an SQL query containing joins and a group-by. The standard way of evaluating this type of query is to first perform all the joins and then the group-by operation. However, it may be possible to perform the group-by early, that is, to push the group-by operation past one or more joins. Early grouping may reduce the query processing cost by reducing the amount of data participating in joins. We formally define the problem, adhering strictly to the semantics of NULL and duplicate elimination in SQL2, and prove necessary and sufficient conditions for deciding when this transformation is valid. In practice, it may be expensive or even impossible to test whether the conditions are satisfied. Therefore, we also present a more practical algorithm that tests a simpler, sufficient condition. This algorithm is fast and detects a large subclass of transformable queries.
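The transformation described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm or its formal condition; the `department` and `employee` tables, their columns, and the helper `run_both` are hypothetical names chosen for illustration. The sketch runs the same aggregate twice in an in-memory SQLite database, once grouping after the join and once grouping before it, and shows that the two forms agree when `dept_id` is unique in `department` but disagree when it is not.

```python
import sqlite3

# Hypothetical sample data; one employee has a NULL dept_id to exercise NULL handling.
EMPLOYEES = [(1, 10, 100), (2, 10, 200), (3, 20, 50), (4, None, 70)]

# Standard evaluation: join first, then group.
LATE_GROUPING = """
    SELECT d.dept_id, d.name, SUM(e.salary) AS total
    FROM employee AS e JOIN department AS d ON e.dept_id = d.dept_id
    GROUP BY d.dept_id, d.name
    ORDER BY d.dept_id
"""

# Early grouping: aggregate employee rows first, then join the smaller result.
EARLY_GROUPING = """
    SELECT d.dept_id, d.name, g.total
    FROM (SELECT dept_id, SUM(salary) AS total
          FROM employee GROUP BY dept_id) AS g
    JOIN department AS d ON g.dept_id = d.dept_id
    ORDER BY d.dept_id
"""


def run_both(departments):
    """Load the sample data and return (late_result, early_result)."""
    conn = sqlite3.connect(":memory:")
    # No key constraint is declared, so callers can pass duplicate dept_id rows.
    conn.execute("CREATE TABLE department (dept_id INTEGER, name TEXT)")
    conn.execute(
        "CREATE TABLE employee (emp_id INTEGER, dept_id INTEGER, salary INTEGER)"
    )
    conn.executemany("INSERT INTO department VALUES (?, ?)", departments)
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", EMPLOYEES)
    late = conn.execute(LATE_GROUPING).fetchall()
    early = conn.execute(EARLY_GROUPING).fetchall()
    conn.close()
    return late, early


if __name__ == "__main__":
    # dept_id is unique: both evaluation orders give the same answer.
    print(run_both([(10, "Sales"), (20, "Ops")]))
    # dept_id is duplicated: the two evaluation orders give different answers.
    print(run_both([(10, "Sales"), (10, "Sales"), (20, "Ops")]))
```

In the second call the duplicated department row makes the late-grouping query sum each salary twice, while the early-grouping query returns the department twice with the original sum. This is the kind of case a validity test has to rule out before the rewrite is applied.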
Keywords: query transformation, query rewrite, SQL, query optimization, group-by, join.
Copyright © 1994 by The Institute of Electrical and Electronics Engineers, Inc. (IEEE). Abstract used with permission.