Performing Group-By before Join.

Weipeng P. Yan, Per-Åke Larson: Performing Group-By before Join. ICDE 1994: 89-100

@inproceedings{DBLP:conf/icde/YanL94,
  author    = {Weipeng P. Yan and
               Per-{\AA}ke Larson},
  title     = {Performing Group-By before Join},
  booktitle = {Proceedings of the Tenth International Conference on Data Engineering,
               February 14-18, 1994, Houston, Texas, USA},
  publisher = {IEEE Computer Society},
  year      = {1994},
  isbn      = {0-8186-5400-7},
  pages     = {89-100},
  ee        = {db/conf/icde/YanL94.html},
  crossref  = {DBLP:conf/icde/94},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

Assume that we have an SQL query containing joins and a group-by. The standard way of evaluating this type of query is to first perform all the joins and then the group-by operation. However, it may be possible to perform the group-by early, that is, to push the group-by operation past one or more joins. Early grouping may reduce the query processing cost by reducing the amount of data participating in joins. We formally define the problem, adhering strictly to the semantics of NULL and duplicate elimination in SQL2, and prove necessary and sufficient conditions for deciding when this transformation is valid. In practice, it may be expensive or even impossible to test whether the conditions are satisfied. Therefore, we also present a more practical algorithm that tests a simpler, sufficient condition. This algorithm is fast and detects a large subclass of transformable queries.

Keywords: query transformation, query rewrite, SQL, query optimization, group-by, join.
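
As an illustration of the transformation the abstract describes, the following is a minimal sketch, not the paper's algorithm. It uses Python's built-in sqlite3 module with two hypothetical tables, dept and emp, and runs the same aggregate query twice: once grouping after the join, and once grouping emp before the join. The rewrite is assumed to be safe here because dept_id is the primary key of dept, so each group of emp rows joins with at most one dept row. The schema, data, and query text are all assumptions made for this example.

```python
import sqlite3

# Hypothetical schema and data, chosen only for illustration.
SCHEMA = """
CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE emp  (emp_id INTEGER PRIMARY KEY, dept_id INTEGER, salary INTEGER);
INSERT INTO dept VALUES (1, 'eng'), (2, 'ops'), (3, 'hr');
INSERT INTO emp  VALUES (1, 1, 100), (2, 1, 150), (3, 2, 80), (4, NULL, 60);
"""

# Standard evaluation order: join first, then group.
LATE_GROUPING = """
SELECT d.dept_id, d.name, SUM(e.salary)
FROM emp e JOIN dept d ON e.dept_id = d.dept_id
GROUP BY d.dept_id, d.name
ORDER BY d.dept_id
"""

# Early grouping: aggregate emp on the join column, then join the
# (smaller) aggregated result with dept.
EARLY_GROUPING = """
SELECT d.dept_id, d.name, s.total
FROM (SELECT dept_id, SUM(salary) AS total
      FROM emp
      GROUP BY dept_id) s
JOIN dept d ON s.dept_id = d.dept_id
ORDER BY d.dept_id
"""


def run_both():
    """Run both query forms on the same in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        late = conn.execute(LATE_GROUPING).fetchall()
        early = conn.execute(EARLY_GROUPING).fetchall()
    finally:
        conn.close()
    return late, early


if __name__ == "__main__":
    late, early = run_both()
    print(late)
    print(early)
    print("same result:", late == early)
```

The emp row with a NULL dept_id forms its own group in the early-grouping subquery, but the inner join discards it, just as the join discards the ungrouped row in the standard form. In this sketch the join after early grouping reads one row per department instead of one row per employee, which is the kind of cost reduction the abstract refers to.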

Printed Edition

Proceedings of the Tenth International Conference on Data Engineering, February 14-18, 1994, Houston, Texas, USA. IEEE Computer Society 1994, ISBN 0-8186-5400-7

References

[1]: C. J. Date, Hugh Darwen: A Guide to SQL Standard, 3rd Edition. Addison-Wesley 1993, ISBN 0-201-55822-X
[2]: Umeshwar Dayal: Of Nests and Trees: A Unified Approach to Processing Queries That Contain Nested Subqueries, Aggregates, and Quantifiers. VLDB 1987: 197-208
[3]: Richard A. Ganski, Harry K. T. Wong: Optimization of Nested SQL Queries Revisited. SIGMOD Conference 1987: 23-33
[4]: ...
[5]: Werner Kießling: On Semantic Reefs and Efficient Processing of Correlation Queries with Aggregates. VLDB 1985: 241-250
[6]: Won Kim: On Optimizing an SQL-like Nested Query. ACM Trans. Database Syst. 7(3): 443-469 (1982)
[7]: Anthony C. Klug: Access Paths in the 'ABE' Statistical Query Facility. SIGMOD Conference 1982: 161-173
[8]: Jim Melton, Alan R. Simon: Understanding the New SQL: A Complete Guide. Morgan Kaufmann 1993, ISBN 1-55860-245-3
[9]: Mauro Negri, Giuseppe Pelagatti, Licia Sbattella: Formal Semantics of SQL Queries. ACM Trans. Database Syst. 16(3): 513-534 (1991)
[10]: Günter von Bültzingsloewen: Translating and Optimizing SQL Queries Having Aggregates. VLDB 1987: 235-243
[11]: ...