ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Distance-Based Indexing for High-Dimensional Metric Spaces.

Tolga Bozkaya, Z. Meral Özsoyoglu: Distance-Based Indexing for High-Dimensional Metric Spaces. SIGMOD Conference 1997: 357-368
@inproceedings{DBLP:conf/sigmod/BozkayaO97,
  author    = {Tolga Bozkaya and
               Z. Meral {\"O}zsoyoglu},
  editor    = {Joan Peckham},
  title     = {Distance-Based Indexing for High-Dimensional Metric Spaces},
  booktitle = {SIGMOD 1997, Proceedings ACM SIGMOD International Conference
               on Management of Data, May 13-15, 1997, Tucson, Arizona, USA},
  publisher = {ACM Press},
  year      = {1997},
  pages     = {357-368},
  ee        = {http://doi.acm.org/10.1145/253260.253345, db/conf/sigmod/BozkayaO97.html},
  crossref  = {DBLP:conf/sigmod/97},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

In many database applications, one of the common queries is to find approximate matches to a given query item from a collection of data items. For example, given an image database, one may want to retrieve all images that are similar to a given query image. Distance based index structures are proposed for applications where the data domain is high dimensional, or the distance function used to compute distances between data objects is non-Euclidean. In this paper, we introduce a distance based index structure called multi-vantage point (mvp) tree for similarity queries on high-dimensional metric spaces. The mvp-tree uses more than one vantage point to partition the space into spherical cuts at each level. It also utilizes the pre-computed (at construction time) distances between the data points and the vantage points. We have done experiments to compare mvp-trees with vp-trees which have a similar partitioning strategy, but use only one vantage point at each level, and do not make use of the pre-computed distances. Empirical studies show that mvp-tree outperforms the vp-tree 20% to 80% for varying query ranges and different distance distributions.

Copyright © 1997 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

Online Version (ACM WWW Account required): Full Text in PDF Format

CDROM Version: Load the CDROM "Volume 1 Issue 1, SIGMOD '93-'97" and ...

DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Joan Peckham (Ed.): SIGMOD 1997, Proceedings ACM SIGMOD International Conference on Management of Data, May 13-15, 1997, Tucson, Arizona, USA. ACM Press 1997 BibTeX , SIGMOD Record 26(2), June 1997
Contents

Online Edition: ACM Digital Library

[Index Terms]
[Full Text in PDF Format, 1444 KB]

References

[AFA93]
Rakesh Agrawal, Christos Faloutsos, Arun N. Swami: Efficient Similarity Search In Sequence Databases. FODO 1993: 69-84 BibTeX
[BK73]
Walter A. Burkhard, Robert M. Keller: Some Approaches to Best-Match File Searching. Commun. ACM 16(4): 230-236(1973) BibTeX
[BKSS90]
Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, Bernhard Seeger: The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles. SIGMOD Conference 1990: 322-331 BibTeX
[Bri95]
Sergey Brin: Near Neighbor Search in Large Metric Spaces. VLDB 1995: 574-584 BibTeX
[Chi94]
Tzi-cker Chiueh: Content-Based Image Indexing. VLDB 1994: 582-593 BibTeX
[FEF+94]
Christos Faloutsos, Ron Barber, Myron Flickner, Jim Hafner, Wayne Niblack, Dragutin Petkovic, William Equitz: Efficient and Effective Querying by Image Content. J. Intell. Inf. Syst. 3(3/4): 231-262(1994) BibTeX
[FRM94]
Christos Faloutsos, M. Ranganathan, Yannis Manolopoulos: Fast Subsequence Matching in Time-Series Databases. SIGMOD Conference 1994: 419-429 BibTeX
[Gut84]
Antonin Guttman: R-Trees: A Dynamic Index Structure for Spatial Searching. SIGMOD Conference 1984: 47-57 BibTeX
[Ott92]
...
[RKV95]
Nick Roussopoulos, Stephen Kelley, Frédéic Vincent: Nearest Neighbor Queries. SIGMOD Conference 1995: 71-79 BibTeX
[Sam89]
Hanan Samet: The Design and Analysis of Spatial Data Structures. Addison-Wesley 1990
BibTeX
[SRF87]
Timos K. Sellis, Nick Roussopoulos, Christos Faloutsos: The R+-Tree: A Dynamic Index for Multi-Dimensional Objects. VLDB 1987: 507-518 BibTeX
[SW90]
Dennis Shasha, Jason Tsong-Li Wang: New Techniques for Best-Match Retrieval. ACM Trans. Inf. Syst. 8(2): 140-158(1990) BibTeX
[Uhl91]
Jeffrey K. Uhlmann: Satisfying General Proximity/Similarity Queries with Metric Trees. Inf. Process. Lett. 40(4): 175-179(1991) BibTeX
[Yia93]
...

Referenced by

  1. Melliyal Annamalai, Rajiv Chopra, Samuel DeFazio: Indexing Images in Oracle8i. SIGMOD Conference 2000: 539-547
  2. Caetano Traina Jr., Agma J. M. Traina, Bernhard Seeger, Christos Faloutsos: Slim-Trees: High Performance Metric Trees Minimizing Overlap Between Nodes. EDBT 2000: 51-65
  3. Tolga Bozkaya, Z. Meral Özsoyoglu: Indexing Large Metric Spaces for Similarity Search Queries. ACM Trans. Database Syst. 24(3): 361-404(1999)
  4. Kevin S. Beyer, Jonathan Goldstein, Raghu Ramakrishnan, Uri Shaft: When Is ''Nearest Neighbor'' Meaningful? ICDT 1999: 217-235
  5. Kaushik Chakrabarti, Sharad Mehrotra: The Hybrid Tree: An Index Structure for High Dimensional Feature Spaces. ICDE 1999: 440-447
  6. Pavel Zezula, Pasquale Savino, Giuseppe Amato, Fausto Rabitti: Approximate Similarity Retrieval with M-Trees. VLDB J. 7(4): 275-293(1998)
  7. Flip Korn, Nikolaos Sidiropoulos, Christos Faloutsos, Eliot Siegel, Zenon Protopapas: Fast and Effective Retrieval of Medical Tumor Shapes. IEEE Trans. Knowl. Data Eng. 10(6): 889-904(1998)
  8. Mihael Ankerst, Hans-Peter Kriegel, Thomas Seidl: A Multistep Approach for Shape Similarity Search in Image Databases. IEEE Trans. Knowl. Data Eng. 10(6): 996-1004(1998)
  9. Beng Chin Ooi, Cheng Hian Goh, Kian-Lee Tan: Fast High-Dimensional Data Search in Incomplete Databases. VLDB 1998: 357-367
  10. David T. Kao, R. Daniel Bergeron, Ted M. Sparr: Efficient Proximity Search in Multivariate Data. SSDBM 1998: 145-154
  11. Paolo Ciaccia, Marco Patella, Pavel Zezula: A Cost Model for Similarity Queries in Metric Spaces. PODS 1998: 59-68
  12. Paolo Ciaccia, Marco Patella, Pavel Zezula: Processing Complex Similarity Queries with Distance-Based Access Methods. EDBT 1998: 9-23
  13. Paolo Ciaccia, Marco Patella, Pavel Zezula: M-tree: An Efficient Access Method for Similarity Search in Metric Spaces. VLDB 1997: 426-435
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Wed Jun 4 18:55:26 2008