CIDR Proceedings

Leveraging Query Optimizers to Verify the Soundness of LLM-based Query Rewrites for Real-World Workloads, and More

Authors:

Vivek Narasayya, Surajit Chaudhuri

Abstract

Query rewriting is one of the techniques used by application developers and DBAs to tune poorly performing queries. Recently, LLM-based query rewriting techniques have been proposed, and these show significant performance improvements on industry benchmark queries. To quantify the effectiveness of LLM-based query rewriting, we conduct an extensive empirical evaluation on Microsoft SQL Server using real-world queries on private enterprise databases. We find that LLM-based rewriting shows promise even on real-world queries. However, LLMs cannot guarantee that a rewrite is semantically equivalent to the original query, and checking that equivalence is a major impediment to the practical adoption of LLM-based rewriting today. We present a sound and efficient technique that leverages built-in capabilities of the query optimizer to verify semantic equivalence. Finally, we observe that LLM-based rewrites can serve as a source from which query optimizer developers can identify candidate transformation rules that are missing from their optimizer. We present a few such rules that we identified from our experience with LLM-based rewrites of real-world and benchmark queries in Microsoft SQL Server.