VLDB 2022: Invited Industry Talks

07 Sep

Sydney Time 13:30 – 15:00
Invited Industry Talks
Chaired by C. Mohan (Tsinghua University)

Scaling Way Up: Observations from a decade of building large-scale, distributed database infrastructure at Google
John Cieslewicz (Google)
Abstract: More than a decade ago, Google began work on a new generation of database infrastructure. Starting from scratch, this infrastructure was built from the ground up for highly distributed, cloud-like deployments -- both Google's large-scale cluster management platform and, later, public cloud. Sharding and otherwise scaling existing infrastructure options had reached its limit: for example, Google AdWords was running on sharded MySQL. This next generation of infrastructure, led by the co-development of Spanner and F1 at the time, was built with first principles around distributed and flexible geographic deployment, nearly unlimited scaling of parallelism, and designed to use data-center scale services. This talk marks the ten-year anniversary of components of this database infrastructure enabling Google's advertising and many other products. Reflecting on these ten years, this talk examines innovation that made the development and use of these systems unique as well as the application of traditional database technologies and thinking that was critical in this new context.
Bio: Dr. John Cieslewicz is a Senior Director of Engineering at Google where he leads the query technology teams within Google's Core Infrastructure group including F1 Query and F1 Lightning, Google's SQL language dialect, as well as internal systems focused on multi-dimensional data modeling and analysis. A Googler for more than a decade, John was an early member of the F1 team and is the architect of its query planning infrastructure. Before Google, John was a software engineer at Aster Data, a distributed database startup acquired by Teradata. John holds a PhD in Computer Science from Columbia University where he was a U.S. Department of Homeland Security Graduate Research Fellow. John also graduated from Stanford University with a degree in Computer Science with Honors in International Security Studies.

Challenges in Evolving a Successful Database Product, SQL Server, to a Cloud Service, SQL Azure
Hanuma Kodavalla (Microsoft)
Abstract: Over the past few years, my team at Microsoft worked on evolving our very successful on-prem database product, SQL Server, to a Database Platform as a Service, SQL Azure, running in the cloud managing millions of databases in many regions globally. Mission-critical services offered by Microsoft as well as by customers worldwide in diverse industries depend on SQL Azure. I’ll describe the problems we faced in evolving a mature on-prem product to a cloud service supporting databases from very small to very large with guaranteed SLAs; various improvements in the areas of elasticity, high availability, recovery, security and query processing; and the software engineering challenges of deploying changes made by hundreds of engineers to a complex codebase on live systems.
Bio: Hanuma Kodavalla is a Technical Fellow in the Azure Databases group at Microsoft where he has been for twenty years. He previously worked at Data General, Digital Equipment Corporation, Oracle, Sybase and Asera. For more than three decades, Hanuma has worked on many aspects of Relational Database Systems and has been instrumental in architecting multiple commercial database systems for high performance and high availability. Hanuma received a BTech in Electronics and Communications in 1981 from National Institute of Technology, Warangal, India, an MTech in Computer Science in 1983 from Indian Institute of Technology, Chennai, India, and an MS in Computer Science in 1988 from University of Massachusetts, Amherst, USA. He has a few publications in database conferences and many patents related to novel implementation techniques for online transaction processing and data warehousing in the areas of concurrency control, recovery, high-availability, query processing and security.

Data Management innovation at Amazon Web Services
Ippokratis Pandis (Amazon Web Services)
Abstract: Amazon Web Services is the market leader among cloud vendors. AWS provides the broadest portfolio of data management services and the rate of innovation at AWS accelerates. This talk focuses on the innovation in the broader area of data management at AWS. We start from the infrastructure, compute and storage, and then discuss innovations in the areas of databases, analytics, including data lakes and streams, integration and ETL as well as UX/visualization.
Bio: Ippokratis Pandis is a VP/Distinguished Engineer at Amazon Web Services. He spends most of his time on AWS's Analytics services, especially Amazon Redshift. Redshift is Amazon's fully managed, petabyte-scale data warehouse service. Previously, Ippokratis held positions as a software engineer at Cloudera, where he worked on the Impala SQL-on-Hadoop query engine, and as a member of the research staff at the IBM Almaden Research Center, where he worked on IBM DB2 BLU. Ippokratis received his PhD from the ECE department at Carnegie Mellon University. He is the recipient of a Test-of-Time award at EDBT 2019. He is the General Chair of SIGMOD 2023 and the president of HPTS.