Random Sampling from B+ Trees.

Frank Olken, Doron Rotem: Random Sampling from B+ Trees. VLDB 1989: 269-277
@inproceedings{DBLP:conf/vldb/OlkenR89,
  author    = {Frank Olken and
               Doron Rotem},
  editor    = {Peter M. G. Apers and
               Gio Wiederhold},
  title     = {Random Sampling from B+ Trees},
  booktitle = {Proceedings of the Fifteenth International Conference on Very
               Large Data Bases, August 22-25, 1989, Amsterdam, The Netherlands},
  publisher = {Morgan Kaufmann},
  year      = {1989},
  isbn      = {1-55860-101-5},
  pages     = {269-277},
  ee        = {db/conf/vldb/OlkenR89.html},
  crossref  = {DBLP:conf/vldb/89},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

We consider the design and analysis of algorithms to retrieve simple random samples from databases. Specifically, we examine simple random sampling from B+ tree files. Existing methods of sampling from B+ trees require the use of auxiliary rank information in the nodes of the tree. Such modified B+ tree files are called "ranked B+ trees". We compare sampling from ranked B+ tree files with new acceptance/rejection (A/R) sampling methods which sample directly from standard B+ trees. Our new A/R sampling algorithm can easily be retrofitted to existing DBMSs and does not require the overhead of maintaining rank information. We consider both iterative and batch sampling methods.
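
The contrast drawn in the abstract can be illustrated with a small sketch. This is not the paper's code: it is a minimal illustration of the general idea under stated assumptions, using a toy in-memory tree in place of a real B+ tree file. The names `Node`, `sample_ranked`, `sample_ar`, `build_toy_tree`, and the parameter `max_fanout` are all hypothetical. The ranked variant relies on per-node subtree counts; the acceptance/rejection variant uses only the maximum fanout and assumes every leaf sits at the same depth, as in a B+ tree. The algorithms analysed in the paper may differ in detail.

```python
import random
from collections import Counter


class Node:
    """Toy tree node (assumption): internal nodes hold children, leaves hold records."""

    def __init__(self, children=None, records=None):
        self.children = children or []
        self.records = records or []
        # Subtree record count: the kind of rank information a ranked tree maintains.
        self.count = len(self.records) + sum(c.count for c in self.children)

    @property
    def is_leaf(self):
        return not self.children


def sample_ranked(root, rng):
    """Pick a uniform rank in [0, count) and descend using subtree counts."""
    r = rng.randrange(root.count)
    node = root
    while not node.is_leaf:
        for child in node.children:
            if r < child.count:
                node = child
                break
            r -= child.count
    return node.records[r]


def sample_ar(root, max_fanout, rng):
    """Acceptance/rejection walk that needs no rank information.

    At every level one of max_fanout slots is chosen uniformly. An empty
    slot rejects the walk, which restarts at the root. Each record is then
    reached with probability (1 / max_fanout) ** depth per attempt, so
    accepted samples are uniform when all leaves share the same depth.
    """
    while True:
        node = root
        while True:
            entries = node.records if node.is_leaf else node.children
            slot = rng.randrange(max_fanout)
            if slot >= len(entries):
                break  # rejected: restart from the root
            if node.is_leaf:
                return entries[slot]
            node = entries[slot]


def build_toy_tree():
    """Nine records in three unevenly filled leaves, all at the same depth."""
    leaves = [Node(records=[1, 2, 3]), Node(records=[4, 5]), Node(records=[6, 7, 8, 9])]
    return Node(children=[Node(children=leaves[:2]), Node(children=leaves[2:])])


if __name__ == "__main__":
    rng = random.Random(0)
    tree = build_toy_tree()
    print(Counter(sample_ranked(tree, rng) for _ in range(9000)))
    print(Counter(sample_ar(tree, 4, rng) for _ in range(9000)))
```

In this sketch the ranked walk never restarts but depends on counts that would have to be maintained on every insert and delete, while the A/R walk trades that maintenance for rejected attempts whenever nodes are not full.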

Copyright © 1989 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Online Paper

ACM SIGMOD Anthology

CDROM Version: ACM SIGMOD Anthology CDROM "Volume 1 Issue 5, VLDB '89-'97"
DVD Version: ACM SIGMOD Anthology DVD 1

Printed Edition

Peter M. G. Apers, Gio Wiederhold (Eds.): Proceedings of the Fifteenth International Conference on Very Large Data Bases, August 22-25, 1989, Amsterdam, The Netherlands. Morgan Kaufmann 1989, ISBN 1-55860-101-5

References

[Ark84]
...
[BK75]
...
[Coc77]
William G. Cochran: Sampling Techniques, 3rd Edition. John Wiley 1977, ISBN 0-471-16240-X
[EN82]
Jarmo Ernvall, Olli Nevalainen: An Algorithm for Unbiased Random Sampling. Comput. J. 25(1): 45-47 (1982)
[FMR62]
...
[Gho86]
Sakti P. Ghosh: SIAM: statistics information access method. Inf. Syst. 13(4): 359-368 (1988)
[HOT88]
Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja: Statistical Estimators for Relational Algebra Expressions. PODS 1988: 276-287
[Knu73]
Donald E. Knuth: The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley 1973, ISBN 0-201-03803-X
[LTA79]
...
[LWW84]
...
[Mon85]
...
[Pal85]
Prashant Palvia: Expressions for Batched Searching of Sequential and Hierarchical Files. ACM Trans. Database Syst. 10(1): 97-106 (1985)
[SL88]
Jaideep Srivastava, Vincent Y. Lum: A Tree Based Access Method (TBSAM) for Fast Processing of Aggregate Queries. ICDE 1988: 504-510
[Vit84]
Jeffrey Scott Vitter: Faster Methods for Random Sampling. Commun. ACM 27(7): 703-718 (1984)
[Vit85]
Jeffrey Scott Vitter: Random Sampling with a Reservoir. ACM Trans. Math. Softw. 11(1): 37-57 (1985)
[WE80]
C. K. Wong, Malcolm C. Easton: An Efficient Method for Weighted Sampling Without Replacement. SIAM J. Comput. 9(1): 111-113 (1980)
[Yao77]
S. Bing Yao: Approximating the Number of Accesses in Database Organizations. Commun. ACM 20(4): 260-261 (1977)

Copyright © Tue Mar 16 02:22:00 2010 by Michael Ley (ley@uni-trier.de)