My students alternate between incredulity and disbelief when I tell them about the not-so ``good old days.'' Back then, in the nineties (the 1990's, not the 1890's), to search the literature on a topic, we first mentally scanned our memory of related papers that we had read. We would then go to our files, hoping we had a copy, or to our bookshelves, hoping we had the proceedings or journal. We spent extravagant amounts of money subscribing to journals so that we would have fast access to these papers.
If the paper wasn't in our office, we would ask around among our colleagues, or trudge down to the library, which, while much more complete than our office, nevertheless sometimes didn't have that particular volume of the conference, because it was checked out or lost, or because the subscription had started years after the first issue had come out. For such papers, we would write to the author, asking them to send us a copy, or use interlibrary loan, either of which added weeks of delay.
Once we got the paper, we immediately went to the bibliography for pointers to related work. Of course, all the references were to earlier papers, which were often less useful, and often harder to find than the original paper. What we really wanted were more recent papers that built on this paper. So we manually scanned the subsequent conferences and journals, looking for other papers by the authors we had identified, as well as titles that looked relevant. This scanning was quite time-consuming, as we had to pull down each journal and conference proceedings from the shelf, and scan through its table of contents. The more enterprising researchers would go to the citation index, pulling out massive volumes, looking up the papers found thus far, and seeing which papers referenced them.
This process in its entirety often took months, with hours in the library punctuated by weeks of waiting for papers, which would spur the need for more papers. All the while there was the lingering doubt, or rather the grudging acceptance, that some relevant papers would be missed. The hope was that those papers we weren't lucky enough to encounter didn't invalidate our work.
When I reminisce about this, amazed at the effort involved, my students invariably ask, ``Why didn't you simply search the SIGMOD Anthology, and get the papers directly?'' I have to remind them that the Anthology didn't exist until mid-1999. The students (and researchers and professors) today have it so much easier (a common refrain of us old-timers). They can search hundreds of papers for a technical term, finding all the papers that contain that term. (To do so manually was out of the question; we had to rely on visually scanning keywords.) Given a paper, they can quickly locate the papers that referenced that paper, thereby going forward in time. They can just as easily call up a list of all the papers an author has written. The paper can be read online, with its formatting intact; only the most relevant papers need be printed.
The Anthology represents a qualitative increase in the access provided to the database literature. It initiates a phase-shift in the scientific process. Research has been forever changed; there is no looking back.
Of course, in another decade or so, my students will talk of their not-so good old days, when they had to flip multiple disks, containing only a subset of the corpus. In contrast, their students will find everything instantly on the internet, via high-speed connections available everywhere, including homes, dorm rooms, planes, trains and cars. I feel, however, that the Anthology on CD-ROM is a requisite step in transitioning from the paper-oriented resources of the 1990's to the fully digital, web-based resources of the 2010's.
This paradigm shift, to use a term coined by Thomas Kuhn, was made possible by a confluence of factors. Michael Ley had long worked to populate his DBLP Bibliography, attempting to include all database papers. Inexpensive scanning and OCR (optical character recognition) enabled digitization of papers. A substantial SIGMOD fund balance permitted the Executive Committee to take the bold step of digitizing all of ACM's material related to databases; this first volume is but a small subset of the corpus that will be collected and made available to the community. The VLDB Foundation, urged on by Stefano Ceri, was an early supporter, increasing the momentum by agreeing to have the VLDB proceedings included, and paying for their digitization. David Lomet, Editor-in-Chief of the IEEE Data Engineering Bulletin, was also an enthusiastic proponent, making available the content of that publication, which necessitated acquiring permission from many, many authors.
Gathering the material was but the first step. Michael Ley deserves the community's clamorous appreciation for fashioning from these manifold files a coherent and effective resource. In the end, it was Michael who made this happen.