## **Database Architectures for New Hardware** Anastassia Ailamaki Computer Science Department Carnegie Mellon University 5000 Forbes Avenue Pittsburgh, PA 15213 U.S.A. e-mail: natassa@cmu.edu ## **Abstract** Thirty years ago, DBMS stored data on disks and cached recently used data in main memory buffer pools, while designers worried about improving I/O performance and maximizing main memory utilization. Today, however, databases live in multi-level memory hierarchies that include disks, main memories, and several levels of processor caches. Four (often correlated) factors have shifted the performance bottleneck of data-intensive commercial workloads from I/O to the processor and memory subsystem. First, storage systems are becoming faster and more intelligent (now disks come complete with their own processors and caches). Second, modern database storage managers aggressively improve locality through clustering, hide I/O latencies using prefetching, and parallelize disk accesses using data striping. Third, main memories have become much larger and often hold the application's working set. Finally, the increasing memory/processor speed gap has pronounced the importance of processor caches to database performance. This tutorial will first survey the computer architecture and database literature on understanding and evaluating database application performance on modern hardware. We will present approaches and methodologies used to produce time breakdowns when executing database workloads on modern processors. We will contrast traditional methods that use system simulation to the more realistic, yet challenging use of hardware event counters. Then, we will survey techniques proposed in the literature to alleviate the problem and their evaluation. We will Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment Proceedings of the 30<sup>th</sup> VLDB Conference, Toronto, Canada, 2004 emphasize the importance and explain the challenges when determining the optimal data placement on all levels of memory hierarchy, and contrast to other approaches such as prefetching data and instructions. Finally, we will discuss open problems and future directions: Is it only the memory subsystem database software architects should worry about? How important are other decisions processors make to database workload behavior? Given the emerging multi-threaded, multi-processor computers with modular, deep cache hierarchies, how feasible is it to create database systems that will adapt to their environment and will automatically take full advantage of the underlying hierarchy? ## About the speaker Anastassia Ailamaki received a B.Sc. degree in Computer Engineering from the Polytechnic School of the University of Patra, Greece, M.Sc. degrees from the Technical University of Crete, Greece and from the University of Rochester, NY, and a Ph.D. degree in Computer Science from the University of Wisconsin-Madison. In 2001, she joined the Computer Science Department at Carnegie Mellon University as an Assistant Professor. Her research interests are in the broad area of database systems and applications, with emphasis on database system behavior on modern processor hardware and disks. Her projects at Carnegie Mellon (including Staged Database Systems, Cache-Resident Data Bases, and the Fates Storage Manager), aim at building systems to strengthen the interaction between the database software and the underlying hardware and I/O devices. Her other research interests include automated database design for scientific databases, storage device modeling, and internet querying. She has received three best-paper awards (VLDB 2001, Performance 2002, and ICDE 2004), an NSF CAREER award (2002), and IBM Faculty Partnership awards in 2001, 2002, and 2003. She is a member of IEEE and ACM.