![]() |
![]() |
![]() |
@inproceedings{DBLP:conf/vldb/IshikawaSF98,
author = {Yoshiharu Ishikawa and
Ravishankar Subramanya and
Christos Faloutsos},
editor = {Ashish Gupta and
Oded Shmueli and
Jennifer Widom},
title = {MindReader: Querying Databases Through Multiple Examples},
booktitle = {VLDB'98, Proceedings of 24rd International Conference on Very
Large Data Bases, August 24-27, 1998, New York City, New York,
USA},
publisher = {Morgan Kaufmann},
year = {1998},
isbn = {1-55860-566-5},
pages = {218-227},
ee = {db/conf/vldb/IshikawaSF98.html},
crossref = {DBLP:conf/vldb/98},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Users often can not easily express their queries. For example, in a multimedia/image by content setting, the user might want photographs with sunsets; in current systems, like QBIC, the user has to give a sample query, andto specify the relative importance of color, shape and texture. Even worse, the user might want correlations between attributes, like, for example, in a traditional, medical record database, a medical researcher might wantto find "mildly overweight patients", where the implied query would be "weight/height ~ 4 lb/inch".
Our goal is to provide a user-friendly, but theoretically solid method, tohandle such queries. We allow the user to give several examples, and, optionally, their 'goodness' scores, and we propose a novel method to "guess" which attributes are important, which correlations are important, and withwhat weight.
Our contributions are twofold: (a) we formalize the problem as a minimization problem and show how to solve for the optimal solution, completely avoiding the ad-hoc heuristics of the past. (b) Moreover, we are the first that can handle 'diagonal' queries (like the 'overweight' query above). Experiments on synthetic and real datasets show that our method estimates quickly and accurately the 'hidden' distance function in the user's mind.
Copyright © 1998 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.