ACM SIGMOD Anthology VLDB dblp.uni-trier.de

Searching Large Lexicons for Partially Specified Terms using Compressed Inverted Files.

Justin Zobel, Alistair Moffat, Ron Sacks-Davis: Searching Large Lexicons for Partially Specified Terms using Compressed Inverted Files. VLDB 1993: 290-301
@inproceedings{DBLP:conf/vldb/ZobelMS93,
  author    = {Justin Zobel and
               Alistair Moffat and
               Ron Sacks-Davis},
  editor    = {Rakesh Agrawal and
               Se{\'a}n Baker and
               David A. Bell},
  title     = {Searching Large Lexicons for Partially Specified Terms using
               Compressed Inverted Files},
  booktitle = {19th International Conference on Very Large Data Bases, August
               24-27, 1993, Dublin, Ireland, Proceedings},
  publisher = {Morgan Kaufmann},
  year      = {1993},
  isbn      = {1-55860-152-X},
  pages     = {290-301},
  ee        = {db/conf/vldb/ZobelMS93.html},
  crossref  = {DBLP:conf/vldb/93},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

There are many advantages to be gained by storing the lexicon of a full text database in main memory. In this paper we describe how to use a compressed inverted file index to searchsuch a lexicon for entries that match a pattern or partially specified term. This method provides an effective compromise between speed and space, running orders of magnitude faster than brute force search, but requiring less memory than other pattern - matching data structures; indeed, in some cases requiring less memory than would be consumed by a single pointer to each string. The pattern search method is based on text indexing techniques and is a successful adaptation of inverted files to main memory databases.

Copyright © 1993 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Online Paper

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 1 Issue 5, VLDB '89-'97" and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

Rakesh Agrawal, Seán Baker, David A. Bell (Eds.): 19th International Conference on Very Large Data Bases, August 24-27, 1993, Dublin, Ireland, Proceedings. Morgan Kaufmann 1993, ISBN 1-55860-152-X
Contents CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

References

[1]
Alfred V. Aho, Margaret J. Corasick: Efficient String Matching: An Aid to Bibliographic Search. Commun. ACM 18(6): 333-340(1975) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[2]
Abraham Bookstein, Shmuel T. Klein: Compression of a Set of Correlated Bitmaps. SIGIR 1991: 63-71 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[3]
Abraham Bookstein, Shmuel T. Klein, Donald A. Ziff: A Systematic Approach to Compressing a Full-Text Retrieval System. Inf. Process. Manage. 28(6): 795-(1992) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[4]
...
[5]
Gordon V. Cormack, R. Nigel Horspool, M. Kaiserswerth: Practical Perfect Hashing. Comput. J. 28(1): 54-58(1985) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[6]
...
[7]
Edward A. Fox, Lenwood S. Heath, Qi Fan Chen, Amjad M. Daoud: Practical Minimal Perfect Hash Functions for Large Databases. Commun. ACM 35(1): 105-121(1992) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[8]
...
[9]
...
[10]
...
[11]
...
[12]
...
[13]
Edward M. McCreight: A Space-Economical Suffix Tree Construction Algorithm. J. ACM 23(2): 262-272(1976) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[14]
Alistair Moffat: Economical Inversion of Large Text Files. Computing Systems 5(2): 125-139(1992) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[15]
...
[16]
Alistair Moffat, Justin Zobel: Parameterised Compression for Sparse Bitmaps. SIGIR 1992: 274-285 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[17]
Donald R. Morrison: PATRICIA - Practical Algorithm To Retrieve Information Coded in Alphanumeric. J. ACM 15(4): 514-534(1968) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[18]
...
[19]
O. Owolabi, D. R. McGregor: Fast Approximate String Matching. Softw., Pract. Exper. 18(4): 387-393(1988) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[20]
...
[21]
Jukka Teuhola: A Compression Method for Clustered Bit-Vectors. Inf. Process. Lett. 7(6): 308-311(1978) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[22]
Esko Ukkonen: Approximate String Matching with q-grams and Maximal Matches. Theor. Comput. Sci. 92(1): 191-211(1992) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[23]
...
[24]
Justin Zobel, Alistair Moffat, Ron Sacks-Davis: An Efficient Indexing Technique for Full Text Databases. VLDB 1992: 352-362 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

Copyright © Tue Mar 16 02:22:03 2010 by Michael Ley (ley@uni-trier.de)