A Multi-tenant Relational OLTP Database at Salesforce

Authors:
Vaibhav Arora, Subho Chatterjee, Terry Chong, Thomas Fanghaenel, Pat Helland, Jamie Martin, Kaushal Mittal, Nat Wyatt

Abstract

Salesforce Database (SalesforceDB) is a relational OLTP database that directly supports the Salesforce multi-tenant application model using a Log-Structured Merge (LSM) storage engine. LSM-based OLTP databases offer many advantages, including excellent write throughput and reduced cross-node coordination in shared-storage architectures. However, LSMs pose challenges in managing the performance cost of reading data across multiple levels. Our read-dominated workloads comprise both single-record probes and range scans. SalesforceDB implements well-known read optimization techniques, including Bloom filters, leveling-based compaction, and fence pointers. While these techniques improve both probe and scan performance, they are insufficient for our workloads. This paper presents three additional optimizations to LSM read performance: a location cache to accelerate key probes, range filters to improve short range scan performance, and early tombstone pruning to minimize tombstone overhead in queue-organized tables. All three optimizations are deployed in production, and we demonstrate their efficacy on real-world workloads.