GlOSS: Text-Source Discovery over the Internet.

Luis Gravano, Hector Garcia-Molina, Anthony Tomasic: GlOSS: Text-Source Discovery over the Internet. ACM Trans. Database Syst. 24(2): 229-264 (1999)
@article{DBLP:journals/tods/GravanoGT99,
  author    = {Luis Gravano and
               Hector Garcia-Molina and
               Anthony Tomasic},
  title     = {GlOSS: Text-Source Discovery over the Internet},
  journal   = {ACM Trans. Database Syst.},
  volume    = {24},
  number    = {2},
  year      = {1999},
  pages     = {229-264},
  ee        = {db/journals/tods/GravanoGT99.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

The dramatic growth of the Internet has created a new problem for users: location of the relevant sources of documents. This article presents a framework for (and experimentally analyzes a solution to) this problem, which we call the text-source discovery problem. Our approach consists of two phases. First, each text source exports its contents to a centralized service. Second, users present queries to the service, which returns an ordered list of promising text sources. This article describes GlOSS, Glossary of Servers Server, with two versions: bGlOSS, which provides a Boolean query retrieval model, and vGlOSS, which provides a vector-space retrieval model. We also present hGlOSS, which provides a decentralized version of the system. We extensively describe the methodology for measuring the retrieval effectiveness of these systems and provide experimental evidence, based on actual data, that all three systems are highly effective in determining promising text sources for a given query.
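
The two phases above can be illustrated with a minimal sketch. The abstract does not specify what each source exports or how the service orders sources, so everything here is an assumption made for illustration: each source is assumed to export its document count and per-word document frequencies, and the service is assumed to rank sources for a conjunctive Boolean query by an independence-based estimate of the number of matching documents. The names SourceSummary, estimate_matches, and rank_sources are hypothetical and are not taken from the article.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SourceSummary:
    """Hypothetical summary that a text source exports to the central service."""
    name: str
    num_docs: int
    # word -> number of documents in the source that contain the word
    doc_freq: dict[str, int] = field(default_factory=dict)


def estimate_matches(summary: SourceSummary, query_terms: list[str]) -> float:
    """Estimate how many documents contain ALL query terms.

    Assumption (not from the abstract): terms occur independently, so the
    estimate is num_docs * product(doc_freq[t] / num_docs).
    """
    if summary.num_docs == 0:
        return 0.0
    estimate = float(summary.num_docs)
    for term in query_terms:
        estimate *= summary.doc_freq.get(term, 0) / summary.num_docs
    return estimate


def rank_sources(summaries: list[SourceSummary],
                 query_terms: list[str]) -> list[tuple[str, float]]:
    """Return (source name, estimate) pairs, most promising first.

    Sources whose estimate is zero are dropped from the list.
    """
    scored = [(s.name, estimate_matches(s, query_terms)) for s in summaries]
    promising = [(name, est) for name, est in scored if est > 0]
    return sorted(promising, key=lambda pair: pair[1], reverse=True)


# Phase 1: sources export summaries (made-up numbers for illustration).
summaries = [
    SourceSummary("db_a", 100, {"text": 40, "source": 10}),
    SourceSummary("db_b", 1000, {"text": 50, "source": 20}),
    SourceSummary("db_c", 10, {"text": 5}),
]

# Phase 2: a user query is answered with an ordered list of sources.
ranking = rank_sources(summaries, ["text", "source"])
```

In this sketch the service never sees the documents themselves, only the compact summaries, which is one plausible reading of a source "exporting its contents" to a centralized service.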

Copyright © 1999 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.



Online Edition: ACM Digital Library


References

[Barbara and Clifton 1992]
Daniel Barbará, Chris Clifton: Information Brokers: Sharing Knowledge in a Heterogeneous Distributed System. DEXA 1993: 80-91
[Bowman et al. 1994]
...
[Callan and Croft 1995]
James P. Callan, Zhihong Lu, W. Bruce Croft: Searching Distributed Collections with Inference Networks. SIGIR 1995: 21-28
[Chamis 1988]
...
[Chang et al. 1996]
Kevin Chen-Chuan Chang, Hector Garcia-Molina, Andreas Paepcke: Boolean Query Mapping Across Heterogeneous Information Sources. IEEE Trans. Knowl. Data Eng. 8(4): 515-521 (1996)
[Danzig et al. 1991]
Peter B. Danzig, Jong Suk Ahn, John Noll, Katia Obraczka: Distributed Indexing: A Scalable Mechanism for Distributed Information Retrieval. SIGIR 1991: 220-229
[Danzig et al. 1992]
Peter B. Danzig, Shih-Hao Li, Katia Obraczka: Distributed Indexing of Autonomous Internet Services. Computing Systems 5(4): 433-459 (1992)
[Dolin et al. 1996]
Ron Dolin, Divyakant Agrawal, Amr El Abbadi, Laura K. Dillon: Pharos: A Scalable Distributed Architecture for Locating Heterogeneous Information Sources. CIKM 1997: 348-355
[Duda and Sheldon 1994]
Andrzej Duda, Mark A. Sheldon: Content Routing in a Network of WAIS Servers. ICDCS 1994: 124-132
[Flater and Yesha 1993]
David W. Flater, Yelena Yesha: An Information Retrieval System for Network Resources. NGITS 1993: 0-
[French et al. 1998]
James C. French, Allison L. Powell, Charles L. Viles, Travis Emmitt, Kevin J. Prey: Evaluating Database Selection Techniques: A Testbed and Experiment. SIGIR 1998: 121-129
[Fullton et al. 1993]
...
[Gravano et al. 1997]
Luis Gravano, Kevin Chen-Chuan Chang, Hector Garcia-Molina, Andreas Paepcke: STARTS: Stanford Proposal for Internet Meta-Searching (Experience Paper). SIGMOD Conference 1997: 207-218
[Gravano and Garcia-Molina 1995a]
Luis Gravano, Hector Garcia-Molina: Generalizing GlOSS to Vector-Space Databases and Broker Hierarchies. VLDB 1995: 78-89
[Gravano and Garcia-Molina 1995b]
...
[Gravano and Garcia-Molina 1997]
Luis Gravano, Hector Garcia-Molina: Merging Ranks from Heterogeneous Internet Sources. VLDB 1997: 196-205
[Gravano et al. 1993]
...
[Gravano et al. 1994a]
Luis Gravano, Hector Garcia-Molina, Anthony Tomasic: The Effectiveness of GlOSS for the Text Database Discovery Problem. SIGMOD Conference 1994: 126-137
[Gravano et al. 1994b]
Luis Gravano, Hector Garcia-Molina, Anthony Tomasic: Precision and Recall of GlOSS Estimators for Database Discovery. PDIS 1994: 103-106
[Kahle and Medlar 1991]
...
[Morris et al. 1993]
...
[Morris et al. 1992]
...
[Neuman 1992]
B. Clifford Neuman: The Prospero File System: A Global File System Based on the Virtual System Model. Computing Systems 5(4): 407-432 (1992)
[Obraczka et al. 1993]
Katia Obraczka, Peter B. Danzig, Shih-Hao Li: Internet Resource Discovery Services. IEEE Computer 26(9): 8-22 (1993)
[Ordille and Miller 1992]
Joann J. Ordille, Barton P. Miller: Distributed Active Catalogs and Meta-Data Caching in Descriptive Name Services. ICDCS 1993: 120-129
[Salton 1989]
Gerard Salton: Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley 1989, ISBN 0-201-12227-8
[Salton et al. 1983]
...
[Salton and McGill 1983]
Gerard Salton, Michael McGill: Introduction to Modern Information Retrieval. McGraw-Hill Book Company 1984, ISBN 0-07-054484-0
[Schwartz 1990]
...
[Schwartz 1993]
Michael F. Schwartz: Internet Resource Discovery at the University of Colorado. IEEE Computer 26(9): 25-35 (1993)
[Schwartz et al. 1992]
Michael F. Schwartz, Alan Emtage, Brewster Kahle, B. Clifford Neuman: A Comparison of Internet Resource Discovery Approaches. Computing Systems 5(4): 461-493 (1992)
[Selberg and Etzioni 1995]
...
[Sheldon et al. 1994]
Mark A. Sheldon, Andrzej Duda, Ron Weiss, James O'Toole, David K. Gifford: Content Routing for Distributed Information Servers. EDBT 1994: 109-122
[Simpson and Alonso 1989]
...
[Tomasic et al. 1997]
Anthony Tomasic, Luis Gravano, Calvin Lue, Peter M. Schwarz, Laura M. Haas: Data Structures for Efficient Broker Implementation. ACM Trans. Inf. Syst. 15(3): 223-253 (1997)
[Voorhees et al. 1995]
Ellen M. Voorhees, Narendra Kumar Gupta, Ben Johnson-Laird: The Collection Fusion Problem. TREC 1994: 0-
[Yan and Garcia-Molina 1995]
Tak W. Yan, Hector Garcia-Molina: SIFT - a Tool for Wide-Area Information Dissemination. USENIX Winter 1995: 177-186
[Zahir and Chang 1992]
Sajjad Zahir, Chew Lik Chang: Online-Expert: An Expert System for Online Database Selection. JASIS 43(5): 340-357 (1992)

Referenced by

  1. Daniela Florescu, Alon Y. Levy, Alberto O. Mendelzon: Database Techniques for the World-Wide Web: A Survey. SIGMOD Record 27(3): 59-74 (1998)
TODS, ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Wed Jun 4 19:23:49 2008