Review - Semantic Compression and Pattern Extraction with Fascicles.

Richard T. Snodgrass: Review - Semantic Compression and Pattern Extraction with Fascicles. ACM SIGMOD Digital Review 2: (2000)

Review

The central, very simple idea of this paper is that records in a table often share identical or similar values for several attributes; this nonuniformity in the data provides opportunities for compressing the table and for mining useful patterns. Fascicles are subsets of the records of a relation that share similar values for some attributes. After fascicles are identified, they can be used to reorder the data, improving the space savings of subsequently applied syntactic compression. One can also summarize the information in the fascicles, then compress the rest of the table, achieving even greater space savings. A careful presentation, a well-done evaluation, and impressive compression results add up to a very nice paper.
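To make the idea concrete, here is a minimal sketch of the two uses described above: reordering records so that similar ones are adjacent, and storing shared values once per group. It is not the paper's algorithm. It assumes the attributes to group on are already known and that records must match exactly on them, whereas the paper also admits values that are merely similar. All function names are hypothetical and chosen for illustration only.

```python
import zlib
from collections import defaultdict


def group_by_shared_values(rows, compact_attrs):
    """Group row indices by their values on the chosen attribute positions.

    Simplifying assumption: rows must match exactly on `compact_attrs`.
    """
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in compact_attrs)
        groups[key].append(i)
    return dict(groups)


def reorder(rows, groups):
    """Place the rows of each group next to each other."""
    return [rows[i] for members in groups.values() for i in members]


def summarize(rows, groups, compact_attrs):
    """Keep the shared values once per group (as the key) and store
    only the remaining attributes for each member row."""
    rest = [a for a in range(len(rows[0])) if a not in compact_attrs]
    return {
        key: [tuple(rows[i][a] for a in rest) for i in members]
        for key, members in groups.items()
    }


def compressed_size(rows):
    """Size in bytes of the rows after generic (syntactic) compression."""
    text = "\n".join(",".join(map(str, r)) for r in rows)
    return len(zlib.compress(text.encode()))


# Toy table: (city, sector, amount); group on the first two attributes.
rows = [
    ("NYC", "retail", 10),
    ("LA", "tech", 20),
    ("NYC", "retail", 30),
    ("LA", "tech", 40),
    ("NYC", "retail", 50),
]
groups = group_by_shared_values(rows, [0, 1])
reordered = reorder(rows, groups)
summary = summarize(rows, groups, [0, 1])
```

On a real table, comparing `compressed_size(rows)` with `compressed_size(reordered)` gives a rough sense of whether the reordering helps; the outcome depends on the data and on the compressor, so the sketch makes no claim about it.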

The reader is forewarned that the presentation is quite dense in places; expect to spend time on the paper to get the most out of it.

References

[1]: H. V. Jagadish, J. Madar, Raymond T. Ng: Semantic Compression and Pattern Extraction with Fascicles. VLDB 1999: 186-198