ACM SIGMOD Anthology VLDB dblp.uni-trier.de

Selectivity Estimation and Query Optimization in Large Databases with Highly Skewed Distribution of Column Values.

Clifford A. Lynch: Selectivity Estimation and Query Optimization in Large Databases with Highly Skewed Distribution of Column Values. VLDB 1988: 240-251
@inproceedings{DBLP:conf/vldb/Lynch88,
  author    = {Clifford A. Lynch},
  editor    = {Fran\c{c}ois Bancilhon and
               David J. DeWitt},
  title     = {Selectivity Estimation and Query Optimization in Large Databases
               with Highly Skewed Distribution of Column Values},
  booktitle = {Fourteenth International Conference on Very Large Data Bases,
               August 29 - September 1, 1988, Los Angeles, California, USA,
               Proceedings},
  publisher = {Morgan Kaufmann},
  year      = {1988},
  isbn      = {0-934613-75-3},
  pages     = {240-251},
  ee        = {db/conf/vldb/Lynch88.html},
  crossref  = {DBLP:conf/vldb/88},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

When column values in a large database follow highly skewed distributions (such as Zipf distributions, typically found in textual databases), qnery optimizers in current relational systems often fail to choose optimal query plans even for simple single-relation queries. The major cause of these optimization failures is incorrect predicate selectivity estimation; the likelihood and cost of such errors are quantified. A scheme for adding userdefined selectivity estimators to a relational DBMS is proposed. The paper defines a series of new selectivity estimation methods that work well with highly skewed value distributions and then compares them to currently used methods such as uniform approximation and histograms. Empirical data from a large bibliographic database is used throughout the analyses in this paper.

Copyright © 1988 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Online Paper

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 1 Issue 4, VLDB '75-'88" and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

François Bancilhon, David J. DeWitt (Eds.): Fourteenth International Conference on Very Large Data Bases, August 29 - September 1, 1988, Los Angeles, California, USA, Proceedings. Morgan Kaufmann 1988, ISBN 0-934613-75-3
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

References

[Bendat & Piersol 1971]
...
[Christodoulakis 1981]
...
[DLA 1987]
...
[Fedorowicz 1981]
...
[Kahn 1967]
...
[Knuth 1968]
Donald E. Knuth: The Art of Computer Programming, Volume I: Fundamental Algorithms, 2nd Edition. Addison-Wesley 1973
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Knuth 1973]
Donald E. Knuth: The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley 1973, ISBN 0-201-03803-X
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Kooi 1980]
Robert Kooi: The Optimization of Queries in Relational Databases. Ph.D. thesis, Case Western Reserve University 1980
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Lynch & Stonebraker 1988]
Clifford A. Lynch, Michael Stonebraker: Extended User-Defined Indexing with Application to Textual Databases. VLDB 1988: 306-317 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Lynch 1987]
...
[Piatetsky-Shapiro & Connell 1984]
Gregory Piatetsky-Shapiro, Charles Connell: Accurate Estimation of the Number of Tuples Satisfying a Condition. SIGMOD Conference 1984: 256-276 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[RTI 1986]
...
[Selinger et al. 1979]
Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price: Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979: 23-34 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Stonebraker et al. 1976]
Michael Stonebraker, Eugene Wong, Peter Kreps, Gerald Held: The Design and Implementation of INGRES. ACM Trans. Database Syst. 1(3): 189-222(1976) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Stonebraker 1986]
Michael Stonebraker: Inclusion of New Types in Relational Data Base Systems. ICDE 1986: 262-269 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Stonebraker & Rowe 1985]
Michael Stonebraker, Lawrence A. Rowe: The Design of Postgres. SIGMOD Conference 1986: 340-355 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Zaniolo 1983]
Carlo Zaniolo: The Database Language GEM. SIGMOD Conference 1983: 207-218 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

Copyright © Tue Mar 16 02:21:59 2010 by Michael Ley (ley@uni-trier.de)