@inproceedings{DBLP:conf/vldb/HuaL91,
author = {Kien A. Hua and
Chiang Lee},
editor = {Guy M. Lohman and
Am\'{\i}lcar Sernadas and
Rafael Camps},
title = {Handling Data Skew in Multiprocessor Database Computers Using
Partition Tuning},
booktitle = {17th International Conference on Very Large Data Bases, September
3-6, 1991, Barcelona, Catalonia, Spain, Proceedings},
publisher = {Morgan Kaufmann},
year = {1991},
isbn = {1-55860-150-3},
pages = {525-535},
ee = {db/conf/vldb/HuaL91.html},
crossref = {DBLP:conf/vldb/91},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
The shared-nothing multiprocessor architecture is known to be more scalable in supporting very large databases. Compared to other join strategies, a hash-based join algorithm is particularly efficient and easily parallelized for this computation model. However, this hardware structure is very sensitive to the data skew problem. Unless the parallel hash join algorithm includes some load balancing mechanism, the skew effect can severely degrade system performance.
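To make the skew problem concrete, the following is a minimal sketch, not taken from the paper, of how hash partitioning on the join key distributes tuples across nodes. The function names, the modulo hash, the four-node setup, and the key distributions are all illustrative assumptions.

```python
def partition_sizes(keys, num_nodes):
    """Count the tuples each node receives under modulo hash partitioning.

    Assumption: integer join keys and a plain modulo hash, chosen only to
    keep the illustration deterministic.
    """
    sizes = [0] * num_nodes
    for key in keys:
        sizes[key % num_nodes] += 1
    return sizes


def skew_ratio(sizes):
    """Largest partition divided by the mean partition size (1.0 = balanced)."""
    return max(sizes) / (sum(sizes) / len(sizes))


# Hypothetical workloads of 1000 tuples each.
uniform_keys = list(range(1000))              # every join key occurs once
skewed_keys = [7] * 700 + list(range(300))    # one key holds 70% of the tuples

uniform = partition_sizes(uniform_keys, 4)
skewed = partition_sizes(skewed_keys, 4)

print(uniform, skew_ratio(uniform))
print(skewed, skew_ratio(skewed))
```

In the skewed case, every tuple carrying the hot key hashes to the same node, so that node receives far more work than the others and determines the overall join time. This is the imbalance that a load balancing mechanism has to address.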
In this paper, we propose two skew avoidance techniques and one skew resolution method. In particular, three new parallel hash join algorithms are presented. We developed an analytical model to study the effectiveness of these algorithms. The performance study indicates that the proposed techniques offer substantial improvement over the conventional strategies in the presence of data skew. It is also interesting to observe that the skew avoidance techniques provide join strategies that are robust against data skew, whereas the skew resolution method offers an adaptive join strategy that outperforms the conventional algorithms under any skew condition.
Copyright © 1991 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.