VLDB 2024: Keynote Speakers
27 Aug
Databases for All of the World's Bytes or How I Learned to Start Querying and Love AI
Speaker: Samuel Madden
(MIT College of Computing Distinguished Professor of Computing)
Abstract:
Over the past five decades, the relational database model has proven to be a scalable and adaptable
model for querying a variety of structured data, with use cases in analytics, transactions, graphs,
streaming and more. However, most of the world’s data is unstructured. Thus, despite the success of
relational systems, the vast majority of the world’s data has remained beyond their reach.
The rise of deep learning and generative AI offers an opportunity to change this. These
models provide a stunning capability to extract semantic understanding from almost any type of
document, including text, images, and video, which can extend the reach of databases to all the
world's data. In this talk I explore how these new technologies will transform the way we build
database management software, creating new systems that can ingest, store, process, and query
all data. Building such systems presents many opportunities and challenges. In this talk I focus on
three: scalability, correctness, and reliability, and argue that the declarative programming paradigm
that has served relational systems so well offers a path forward in the new world of AI data systems
as well. To illustrate this, I describe several examples of such declarative AI systems we have built
in document and video processing, and provide a set of research challenges and opportunities to guide
research in this exciting area going forward.
Bio:
Samuel Madden is the College of Computing Distinguished Professor of Computing at MIT. His research
interests include databases, distributed computing, and networking. Research projects include learned
database systems, the C-Store column-oriented database system, and the CarTel mobile sensor network
system. Madden heads the Data Systems Group at MIT and the Data Science and AI Lab (DSAIL), an
industry-supported collaboration focused on developing systems that use AI and machine learning.
Madden received his Ph.D. from the University of California at Berkeley in 2003, where he worked on
the TinyDB system for data collection from sensor networks. Madden was named one of Technology
Review's Top 35 Under 35 in 2005 and an ACM Fellow in 2020, and is the recipient of several awards,
including an NSF CAREER award, a Sloan Foundation Fellowship, the ACM SIGMOD Edgar F. Codd Innovations
Award, and "test of time" awards from VLDB, SIGMOD, SIGMOBILE, and SenSys. He is the co-founder and
Chief Scientist at Cambridge Mobile Telematics, which develops technology to make roads safer and
drivers better.
28 Aug
Harmonizing ML and Databases: A Symphony of Data
Speaker: Fatma Ozcan
(Principal Engineer at Systems Research@Google)
Abstract:
Large language models (LLMs) are rapidly transforming the landscape of computing and daily life,
demonstrating immense potential across diverse applications like natural language processing,
machine translation, and code generation. This talk delves into the impact of LLMs on database
research. Specifically, we'll examine how LLMs are fueling innovation in natural language interfaces
for data interaction, highlighting current limitations and advocating for semantic data models and
enhanced context to improve the accuracy of these solutions. Drawing inspiration from LLMs, we'll
introduce a novel paradigm for database cost modeling, leveraging pre-trained models and fine-tuning
techniques. We'll share our early-stage prototype and initial results, and outline a research roadmap
highlighting numerous exciting challenges in this evolving field.
Bio:
Fatma Ozcan is a Principal Engineer at Systems Research@Google. Before that, she was a Distinguished
Research Staff Member and a senior manager at IBM Almaden Research Center. Her current research
focuses on ML for databases, NL2SQL, and platforms and infrastructure for large-scale data analysis.
Dr. Ozcan received her PhD in computer science from the University of Maryland, College Park, and her
BSc in computer engineering from METU, Ankara. She has over 23 years of experience in industrial
research, and has delivered core technologies into data management products. She has been a
contributor to various SQL standards, including SQL/XML, SQL/JSON and SQL/PTF. She co-authored
several conference papers and patents and received the VLDB Women in Database Research Award in 2022.
She is an ACM Distinguished Member, and the vice chair of ACM SIGMOD.
29 Aug
Sharing Information with Differential Privacy: A Database Perspective
Speaker: Xiaokui Xiao
(School of Computing, National University of Singapore)
Abstract:
In the digital age, the widespread collection and analysis of data pose significant privacy
challenges. Differential privacy (DP) has emerged as a leading framework for ensuring that
information release does not compromise individual privacy. In this talk, we will delve into the
theoretical and practical aspects of achieving DP from a database perspective. We will start by
examining database reconstruction attacks and their implications. We will then explore the design of
DP query processing techniques, as well as the generation of synthetic databases under DP. Finally,
we will discuss future directions for research in DP data management.
Bio:
Xiaokui Xiao is a professor at the School of Computing, National University of Singapore. His
research focuses on data management and analytics, especially on data privacy and algorithms for
large data. He is a co-recipient of the VLDB 2021 Best Research Paper Award, the 2022 ACM SIGMOD
Research Highlight Award, and the 2024 ACM SIGMOD Test-of-Time Award. He is an IEEE Fellow, an ACM
Distinguished Member, and a trustee of the VLDB Endowment.