Optimization of complex SQL queries is still an active and challenging area of work. It is fair to say that except for a handful of researchers and practitioners, few have a good understanding of optimization techniques for queries that go beyond the Select-Project-Join class of queries. The challenges stem from identifying algebraic transformations that are correct, and significant with respect to spheres of applicability and potential performance gains. Published in VLDB 1987, this paper by Dayal was a bold attempt to cast much of the earlier work on complex queries involving nested sub-queries, aggregation and outerjoins in a unified algebraic framework. But, the paper was much more than mere consolidation of earlier ideas in optimization of complex queries. The paper sparkles with many insightful observations. Let me mention some of the insights that are my personal favorites. The paper observed that semi-joins, introduced earlier in the context of distributed databases, is an effective abstraction for optimizing nested sub-queries (and a broad class of existential queries) by reducing computation needed for different "parts" of the same query. This idea is very closely related to the "Magic-Sets" optimization for correlated SQL queries that was pursued a few years later. This was also the first paper that I read which mentioned examples where group-by and join operations may be commuted. Several researchers (myself included) later looked into this very topic more deeply. Finally, the paper also discussed how outerjoins and joins may be commuted. Subsequently, the thesis by Cesar Galindo-Legaria further advanced work on optimizing outerjoins and joins. The paper is not easy to read but there has been a lot of follow-up work that deepened some of the ideas that were presented in this paper. Nonetheless, this paper has plenty of insights to reward a reader handsomely. If you plan to work on complex query optimization, this is a must-read.
[Full Disclosure: The author was my manager from 1993-1995. But, my view on this paper did not change since 1989, when I had read the paper for the first time.]
Copyright © 2000 by the author(s). Review published with permission.