VLDB 2023: Schedule of Papers and Tutorials

Vision for DB Systems

Chair: Yuanyuan Tian (Microsoft Gray Systems Lab)

Towards Migration-Free Just-In-Case Data Archival for Future Cloud Data Lakes using Synthetic DNA [vision] Eugenio Marinelli (Eurecom)*; Yiqing Yan (Eurecom); Virginie Magnone (IPMC); Charlotte Dumargne (IPMC); Pascal Barbry (IPMC); Thomas Heinis (Imperial College); Raja Appuswamy (Eurecom)

The Composable Data Management System Manifesto [vision] Pedro Pedreira (Meta Platforms)*; Orri Erling (Meta Platforms); Konstantinos Karanasos (Meta); Scott Schneider (Meta Platforms); Wes McKinney (Voltron Data); Satya Valluri (Databricks); Mohamed Zait (Databricks); Jacques Nadeau (Sundeck)

Check Out the Big Brain on BRAD: Simplifying Cloud Data Processing with Learned Automated Data Meshes [vision] Tim Kraska (Massachusetts Institute of Technology); Tianyu Li (MIT CSAIL); Samuel Madden (Massachusetts Institute of Technology); Markos Markakis (Massachusetts Institute of Technology); Amadou L Ngom (Massachusetts Institute of Technology); Ziniu Wu (Massachusetts Institute of Technology); Geoffrey X. Yu (Massachusetts Institute of Technology)*

The Case for Distributed Shared-Memory Databases with RDMA-Enabled Memory Disaggregation [vision] Ruihong Wang (Purdue University); Jianguo Wang (Purdue University)*; Stratos Idreos (Harvard); M. Tamer Özsu (University of Waterloo); Walid G Aref (Purdue)

Keep CALM and CRDT On [vision] Shadaj Laddad (UC Berkeley); Conor Power (UC Berkeley)*; Mae Milano (UC Berkeley); Alvin Cheung (UC Berkeley); Natacha Crooks (UC Berkeley); Joseph M Hellerstein (UC Berkeley)

(Industrial) Performance & Resource Optimization for Cloud

Chair: Hanuma Kodavalla (Microsoft)

Taurus MM: bringing multi-master to the cloud [industry] Alex Depoutovitch (Huawei)*; Paul Larson (Huawei); Jack Ng (Huawei); Shu Lin (Huawei); Chong Chen (Huawei); Guanzhu Xiong (Huawei); Emad Boctor (Huawei); Paul Lee (Huawei); Samiao Ren (Huawei); Lengdong Wu (Huawei); Yuchen Zhang (Huawei); Calvin Sun (Huawei)

The Story of AWS Glue [industry] Mohit Saxena (Amazon Web Services)*; Benjamin Sowell (Aryn); Daiyan Alamgir (Amazon Web Services); Nitin Bahadur (Amazon Web Services); Bijay Bisht (Amazon Web Services); Santosh Chandrachood (Amazon Web Services); Chitti Keswani (Amazon Web Services); G2 Krishnamoorthy (Amazon Web Services); Austin Lee (Amazon Web Services); Bohou Li (Amazon Web Services); Zach Mitchell (Amazon Web Services); Vaibhav Porwal (Amazon Web Services); Maheedhar Reddy Chappidi (Amazon Web Services); Brian Ross (Amazon Web Services); Noritaka Sekiyama (Amazon Web Services); Omer Zaki (Amazon Web Services); Linchi Zhang (Amazon Web Services); Mehul Shah (Amazon)

Anser: Adaptive Information Sharing Framework of AnalyticDB [industry] Liang Lin (Alibaba); Yuhan Li (Alibaba Cloud Computing Co. Ltd.); Bin Wu (Alibaba Group)*; Huijun Mai (Alibaba); Renjie Lou (Alibaba); Jian Tan (Alibaba); Feifei Li (Alibaba Group)

Eigen: End-to-end Resource Optimization for Large-Scale Databases on the Cloud [industry] Ji You Li (Alibaba Group); Jiachi Zhang (Georgetown University); Wenchao Zhou (Alibaba Group)*; Yuhang Liu (Alibaba); Shuai Zhang (Alibaba); Xue Zhuoming (Alibaba); Ding Xu (Alibaba); Hua Fan (Alibaba Group); Fangyuan Zhou (Alibaba Group); Feifei Li (Alibaba Group)

Automatic SQL Error Mitigation in Oracle [industry] Krishna Kantikiran Pasupuleti (Oracle)*; Jiakun Li (Oracle America); Hong Su (Oracle); Mohamed Ziauddin (Oracle)

(Industrial) ML + Systems

Chair: Avrilia Floratou (Microsoft Gray Systems Lab)

AutoSteer: Learned Query Optimization for Any SQL Database [industry] Christoph Anneser (Technical University of Munich)*; Nesime Tatbul (Intel Labs and MIT); David E Cohen (Intel); Zhenggang Xu (Meta Platforms); Prithviraj P Pandian (Meta); Nikolay Laptev (Facebook); Ryan C Marcus (Massachusetts Institute of Technology)

EmbedX: A Versatile, Efficient and Scalable Platform to Embed Both Graphs and High-Dimensional Sparse Data [industry] Yuanhang Zou (Tencent); Zhihao Ding (The Hong Kong Polytechnic University); Jieming Shi (The Hong Kong Polytechnic University)*; Shuting Guo (Tencent); Chunchen Su (Tencent); Yafei Zhang (Tencent)

MINT: Detecting Fraudulent Behaviors from Time-series Relational Data [industry] Fei Xiao (Shopee Singapore)*; Yuncheng Wu (National University of Singapore); Meihui Zhang (Beijing Institute of Technology); Gang Chen (Zhejiang University); Beng Chin Ooi (NUS)

PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel [industry] Yanli Zhao (Meta)*; Andrew Gu (Meta); Rohan Varma (Meta); Liang Luo (Meta); Chien-Chin Huang (Meta Platforms); Min Xu (Meta Platforms); Less Wright (Meta Platforms); Hamid Shojanazeri (Meta Platforms); Myle Ott (Facebook); Sam Shleifer (Stanford University); Alban Desmaison (Meta); Can Balioglu (Meta Platforms); Pritam H Damania (Meta Platforms); Bernard Nguyen (Meta Platforms); Geeta Chauhan (Meta Platforms); Yuchen Hao (Meta Platforms); Ajit Mathews (Meta); Shen Li (Meta)

(Industrial) Novel Systems for Real-World Uses

Chair: Jesus Camacho-Rodriguez (Microsoft Gray Systems Lab)

Progressive Partitioning for Parallelized Query Execution in Google's Napa [industry] Jun Tatemura (Google); Tao Zou (Google); Jagan Sankaranarayanan (Google)*; Yanlai Huang (Google); Jim Chen (Google); Yupu Zhang (Google); Kevin Lai (Google); Hao Zhang (Google); Gokul Nath Babu Manoharan (Google); Goetz Graefe (Google); Divyakant Agrawal (Google); Brad Adelberg (Google); Shilpa Kolhar (Google); Indrajit Roy (Google)

OceanBase Paetica: A Hybrid Shared-nothing/Shared-everything Database for Supporting Single Machine and Distributed Cluster [industry] Zhifeng Yang (OceanBase); Quanqing Xu (OceanBase)*; Shanyan Gao (OceanBase, Ant Group); Chuanhui Yang (OceanBase); Guoping Wang (OceanBase, Ant Group); Yuzhong Zhao (oceanbase); Fanyu Kong (Oceanbase); Hao Liu (OceanBase); Wanhong Wang (OceanBase, Ant Group); Jinliang Xiao (OceanBase, Ant Group)

PolarDB-SCC: A Cloud-Native Database Ensuring Low Latency for Strongly Consistent Reads [industry] Xinjun Yang (Alibaba Group); Yingqiang Zhang (Alibaba Group); Hao Chen (Alibaba Group)*; Chuan Sun (Alibaba Group); Feifei Li (Alibaba Group); Wenchao Zhou (Alibaba Group)

ScalarDB: Universal Transaction Manager for Polystores [industry] Hiroyuki Yamada (Scalar)*; Toshihiro Suzuki (Scalar); Yuji Ito (Scalar); Jun Nemoto (Scalar)

(Industrial) Data Governance, Lineage, & Benchmarking

Chair: Abdul Quamar (Google)

CDSBen: Benchmarking the Performance of Storage Services in Cloud-native Database System at ByteDance [industry] Jiashu Zhang (Southern University of Science and Technology); Wen Jiang (Southern University of Science and Technology); Bo Tang (Southern University of Science and Technology)*; Haoxiang Ma (ByteDance); Cao Lixun (ByteDance); Zhongbin Jiang (ByteDance); Yuanyuan Nie (ByteDance); Fan Wang (ByteDance); Lei Zhang (ByteDance); Yuming Liang (ByteDance)

Microsoft Purview: A System for Central Governance of Data [industry] Shafi Ahmad (Microsoft); Dillidorai Arumugam (Microsoft); Srdan Bozovic (Microsoft); Elnata Degefa (Microsoft); Sailesh K Duvvuri (C and AI); Steven Gott (Microsoft); Nitish Gupta (Microsoft); Joachim Hammer (Microsoft); Nivedita Kaluskar (Microsoft); Raghav Kaushik (Microsoft)*; Rakesh Khanduja (Microsoft); Prasad Mujumdar (Microsoft); Gaurav Malhotra (Microsoft); Pankaj Naik (Microsoft); Nikolas Ogg (Microsoft); Krishna Kumar Parthasarthy (Microsoft); Raghu Ramakrishnan (Microsoft); Vladimir Rodriguez (Microsoft); Rahul Sharma (Microsoft India R&D Pvt ltd); Jakub Szymaszek (Microsoft); Andreas Wolter (Microsoft)

TPCx-AI - An Industry Standard Benchmark for Artificial Intelligence and Machine Learning Systems [industry] Christoph Brücke (bankmark); Philipp Härtling (bankmark); Rodrigo D Escobar Palacios (Intel); Hamesh Patel (Intel); Tilmann Rabl (HPI, University of Potsdam)*

OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance From Database Query Event Logs [industry] Fotios Psallidas (Microsoft)*; Ashvin Agrawal (Microsoft); Chandru Sugunan (Snowflake); Khaled Ibrahim (Microsoft); Konstantinos Karanasos (Meta); Jesús Camacho-Rodríguez (Microsoft); Avrilia Floratou (Microsoft); Carlo Curino (Microsoft); Raghu Ramakrishnan (Microsoft)

(Industry) Real-Time & Stream Processing

Chair: Juan Colmenares (LinkedIn)

StreamOps: Cloud-Native Runtime Management for Streaming Services in ByteDance [industry] Yancan Mao (National University of Singapore)*; Zhanghao Chen (ByteDance); Yifan Zhang (ByteDance); Meng Wang (ByteDance); Yong Fang (ByteDance); Guanghui Zhang (ByteDance); Rui Shi (ByteDance); Richard T.B. Ma (National University of Singapore)

Krypton: Real-time Serving and Analytical SQL Engine at ByteDance [industry] Jianjun Chen (Bytedance)*; Rui Shi (ByteDance); Heng Chen (ByteDance); Li Zhang (ByteDance); Ruidong Li (Bytedance.com); Wei Ding (Bytedance); Liya Fan (Bytedance corporation); Hao Wang (ByteDance); Mu Xiong (ByteDance); Yuxiang Chen (ByteDance); Benchao Dong (Bytedance); Kuankuan Guo (Bytedance); Yuanjin Lin (ByteDance Technology Co Ltd.); Xiao Liu (Bytedance); Haiyang Shi (ByteDance); Peipei Wang (ByteDance); Zikang Wang (ByteDance Technology Co Ltd.); Yang Yemeng (ByteDance Ltd.); Junda Zhao (ByteDance); Dongyan Zhou (ByteDance); Zhikai Zuo (bytedance); Yuming Liang (ByteDance)

Techniques and Efficiencies from Building a Real-Time DBMS [industry] V Srinivasan (Aerospike)*; B Narendran (Aerospike); Andrew Gooding (Aerospike); Thomas Lopatic (Aerospike); Kevin Porter (Aerospike); Sunil Sayyaparaju (Aerospike); Ashish Shinde (Aerospike)

Lindorm TSDB: A Cloud-native Time-series Database for Large-scale Monitoring Systems [industry] Shen Chunhui (alibaba); Qianyu Ouyang (Alibaba); Feibo Li (Alibaba group); Liu Zhipeng (alibaba); Longcheng Zhu (Alibaba); Yujie Zou (Alibaba Group); Qing Su (Alibaba Cloud); Tianhuan Yu (alibaba-inc); Yi Yi (Alibaba Group); Jianhong Hu (Alibaba Group); Cen Zheng (Alibaba Group)*; Bo Wen (Alibaba); Hanbang Zheng (Alibaba Group); Lunfan Xu (Alibaba Group); Sicheng Pan (Alibaba Group); Bin Wu (Alibaba Group); Xiao He (Alibaba Group); Ye Li (Alibaba); Jian Tan (Alibaba); Sheng Wang (Alibaba Group); Dan Pei (Tsinghua University); Wei Zhang (Alibaba); Feifei Li (Alibaba Group)

Kora: A Cloud-Native Event Streaming Platform For Kafka [industry] Anna Povzner (Confluent)*; Prince Mahajan (Confluent); Jason Gustafson (Confluent); Jun Rao (Confluent); Ismael Juma (Confluent); Feng Min (Confluent); Shriram Sridharan (Confluent); Nikhil Bhatia (Confluent); Gopi Attaluri (Confluent); Adithya Chandra (Confluent); Stanislav Kozlovski (Confluent); Rajini Sivaram (Confluent); Lucas Bradstreet (Confluent); Bob Barrett (Confluent); Dhruvil Shah (Confluent); David Jacot (Confluent); David Arthur (Confluent); Manveer Chawla (Confluent); Ron Dagostino (Confluent); Colin McCabe (Confluent); Manikumar Reddy Obili (Confluent); Kowshik Prakasam (Confluent); Jose Garcia Sancio (Confluent); Vikas Singh (Confluent); Alok Nikhil (Confluent); Kamal Gupta (Confluent)

Foundations for Patterns, Constraints, & Dependencies

Chair: Matteo Lissandrini (Aalborg University)

Representing Paths in Graph Database Pattern Matching Wim Martens (University of Bayreuth)*; Matthias Niewerth (University of Bayreuth); Tina Popp (University of Bayreuth); Carlos Rojas (PUC); Stijn Vansummeren (Hasselt University); Domagoj Vrgoč (PUC)

Semi-Oblivious Chase Termination for Linear Existential Rules: An Experimental Study [eab] Marco Calautti (University of Milan); Mostafa Milani (The University of Western Ontario); Andreas Pieris (University of Edinburgh & University of Cyprus)*

Normalizing Property Graphs Philipp Skavantzos (The University of Auckland); Sebastian Link (University of Auckland)*

Exploiting the Power of Equality-Generating Dependencies in Ontological Reasoning Luigi Bellomarini (Banca d'Italia); Davide Benedetto (Università Roma Tre); Matteo Brandetti (TU Wien); Emanuel Sallinger (TU Wien)

Witness Generation for JSON Schema Lyes Attouche (Université Paris-Dauphine); Mohamed-Amine Baazizi (Sorbonne Université); Dario Colazzo (Univ. Paris Dauphine - PSL); Giorgio Ghelli (Università di Pisa); Carlo Sartiani (Università della Basilicata); Stefanie Scherzinger (University of Passau)

Provenance, Trace Capture, & Process Mining

Chair: Riccardo Tommasini (INSA Lyon - LIRIS)

Erebus: Explaining the Outputs of Data Streaming Queries Dimitris Palyvos-Giannas (Chalmers University of Technology)*; Katerina Tzompanaki (CY Cergy Paris University); Marina Papatriantafilou (Chalmers University of Technology); Vincenzo Gulisano (Chalmers University of Technology)

R^3: Record-Replay-Retroaction for Database-Backed Applications Qian Li (Stanford University)*; Peter Kraft (Stanford University); Michael Cafarella (MIT CSAIL); Çağatay Demiralp (Sigma Computing); Goetz Graefe (Google); Christos Kozyrakis (Stanford University); Michael Stonebraker (Massachusetts Institute of Technology); Lalith Suresh (VMware Research); Xiangyao Yu (University of Wisconsin-Madison); Matei Zaharia (Berkeley and Databricks)

An Experimental Evaluation of Process Concept Drift Detection [eab] Jan Niklas Adams (Chair for Process and Data Science, RWTH Aachen)*; Cameron Pitsch (RWTH Aachen University); Tobias Brockhoff (Chair of Process and Data Science, RWTH Aachen University); Wil M.P. van der Aalst (RWTH Aachen University)

Mining Frequent Infix Patterns from Concurrency-Aware Process Execution Variants Michael Martini (RWTH Aachen); Daniel Schuster (Fraunhofer-Institut für Angewandte Informationstechnik FIT)*; Wil M.P. van der Aalst (RWTH Aachen University)

View & Change Management

Chair: Wolfgang Gatterbauer (Northeastern University)

Online Schema Evolution is (Almost) Free for Snapshot Databases Tianxun Hu (Simon Fraser University)*; Tianzheng Wang (Simon Fraser University); Qingqing Zhou (Tencent)

Making Cache Monotonic and Consistent Shuai An (University of Edinburgh); Yang Cao (University of Edinburgh)*

DBSP: Automatic Incremental View Maintenance for Rich Query Languages Mihai Budiu (VMware Research)*; Tej Chajed (VMware Research); Frank McSherry (Materialize); Leonid Ryzhyk (VMware Research); Val Tannen (University of Pennsylvania)

SageDB: An Instance-Optimized Data Analytics System Jialin Ding (AWS); Ryan Marcus (MIT); Andreas Kipf (Amazon Web Services); Vikram Nathan (MIT); Aniruddha Nrusimha (MIT); Kapil Vaidya (MIT); Alexander van Renen (Friedrich-Alexander-Universität Erlangen-Nürnberg); Tim Kraska (MIT)

Information Integration & Mining

Chair: El Kindi Rezig (University of Utah)

Auto-BI: Automatically Build BI-Models Leveraging Local Join Prediction and Global Schema Graph Yiming Lin (University of California at Irvine); Yeye He (Microsoft Research)*; Surajit Chaudhuri (Microsoft Research)

Fast Algorithms for Denial Constraint Discovery Eduardo H. M. Pena (UTFPR)*; Fabio Porto (LNCC); Felix Naumann (Hasso Plattner Institute, University of Potsdam)

Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V Roee Shraga (Northeastern University)*; Renée J. Miller (Northeastern University)

Learning and Deducing Temporal Orders Wenfei Fan (University of Edinburgh); Resul Tugay (University of Edinburgh); Yaoshu Wang (Shenzhen Institute of Computing Sciences, Shenzhen University)*; Min Xie (Shenzhen Institute of Computing Sciences); Muhammad Asif Ali (King Abdullah University of Science and Technology)

Extraction of Validating Shapes from very large Knowledge Graphs [sds] Kashif Rabbani (Aalborg University Denmark)*; Matteo Lissandrini (Aalborg University); Katja Hose (TU Wien)

Similarity Join & Entity Resolution

Chair: Reynold Cheng (University of Hong Kong)

TokenJoin: Efficient Filtering for Set Similarity Join with Maximum Weighted Bipartite Matching Alexandros Zeakis (National and Kapodistrian University of Athens); Dimitrios Skoutas (Athena Research Center)*; Dimitris Sacharidis (ULB); Odysseas Papapetrou (TU Eindhoven); Manolis Koubarakis (University of Athens, Greece)

A Two-Level Signature Scheme for Stable Set Similarity Joins Daniel Schmitt (University of Salzburg)*; Daniel Kocher (University of Salzburg); Nikolaus Augsten (University of Salzburg); Willi Mann (Celonis SE); Alexander Miller (University of Salzburg)

Sparkly: A Simple yet Surprisingly Strong TF/IDF Blocker for Entity Matching Derek Paulsen (University of Wisconsin-Madison)*; Yash Govind (Apple); AnHai Doan (UW-Madison)

Pre-trained Embeddings for Entity Resolution: An Experimental Analysis [eab] Alexandros Zeakis (National and Kapodistrian University of Athens); George Papadakis (University of Athens); Dimitrios Skoutas (Athena Research Center)*; Manolis Koubarakis (University of Athens, Greece)

Through the Fairness Lens: Experimental Analysis and Evaluation of Entity Matching [eab] Nima Shahbazi (University of Illinois at Chicago)*; Nikola Danevski (University of Rochester); Fatemeh Nargesian (University of Rochester); Abolfazl Asudeh (University of Illinois Chicago); Divesh Srivastava (AT&T Chief Data Office)

Data Exploration/Transformation & Usability

Chair: Oliver Kennedy (University at Buffalo)

Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples Peng Li (Georgia Institute of Technology); Yeye He (Microsoft Research)*; Cong Yan (Microsoft Research); Yue Wang (Microsoft Research); Surajit Chaudhuri (Microsoft Research)

Transactional Panorama: A Conceptual Framework for User Perception in Analytical Visual Interfaces Dixin Tang (University of California at Berkeley)*; Alan Fekete (University of Sydney); Indranil Gupta (UIUC); Aditya G. Parameswaran (University of California at Berkeley)

FEDEX: An Explainability Framework for Data Exploration Steps Daniel Deutch (Tel Aviv University); Amir Gilad (Duke University); Tova Milo (Tel Aviv University); Amit Mualem (Tel Aviv University); Amit Somech (Bar-Ilan University)

Bolt-on, Compact, and Rapid Program Slicing for Notebooks [sds] Shreya Shankar (University of California Berkeley); Stephen Macke (University of Illinois at Urbana-Champaign); Sarah Chasins (UC Berkeley); Andrew Head (University of California, Berkeley); Aditya Parameswaran (University of California, Berkeley)

Indexing

Chair: Abolfazl Asudeh (University of Illinois at Chicago)

Towards Efficient Index Construction and Approximate Nearest Neighbor Search in High-Dimensional Spaces Xi Zhao (HKUST)*; Yao Tian (The Hong Kong University of Science and Technology); Kai Huang (HKUST); Bolong Zheng (Huazhong University of Science and Technology); Xiaofang Zhou (The Hong Kong University of Science and Technology)

CORE-Sketch: On Exact Computation of Median Absolute Deviation with Limited Space Haoquan Guan (Tsinghua University); Ziling Chen (Tsinghua University); Shaoxu Song (Tsinghua University)*

BP-tree: Overcoming the Point-Range Operation Tradeoff for In-Memory B-trees Helen Xu (Lawrence Berkeley National Laboratory)*; Amanda Li (Massachusetts Institute of Technology); Brian Wheatman (Johns Hopkins University); Manoj Marneni (University of Utah); Prashant Pandey (University of Utah)

B-trees are the go-to data structure for in-memory indexes in databases and storage systems. B-trees support both point operations (i.e., inserts and finds) and range operations (i.e., iterators and maps). However, there is an inherent tradeoff between point and range operations, since the optimal node size for point operations is much smaller than the optimal node size for range operations. Existing implementations use a relatively small node size to achieve fast point operations at the cost of range operation throughput. We present the BP-tree, a variant of the B-tree that overcomes the decades-old point-range operation tradeoff in traditional B-trees. In the BP-tree, the leaf nodes are much larger than the internal nodes to support faster range scans. To avoid any slowdown in point operations due to large leaf nodes, we introduce a new insert-optimized array called the buffered partitioned array (BPA) to efficiently organize data in leaf nodes. The BPA supports fast insertions by delaying ordering the keys in the array. This results in much faster range operations and faster point operations at the same time in the BP-tree. Our experiments show that on 48 hyperthreads, on workloads generated from the Yahoo! Cloud Serving Benchmark (YCSB), the BP-tree supports similar or faster point operation throughput (between 0.94× and 1.2×) compared to Masstree and OpenBw-tree, two state-of-the-art in-memory key-value (KV) stores. On a YCSB workload with short scans, the BP-tree is about 7.4× faster than Masstree and 1.6× faster than OpenBw-tree. Furthermore, we extend the YCSB to add large range workloads, commonly found in database applications, and show that the BP-tree is 30× faster than Masstree and 2.5× faster than OpenBw-tree.
We also provide a reference implementation for a concurrent B+-tree and find that the BP-tree supports faster (by between 1.03× and 1.2×) point operations when compared to the best-case configuration for B+-trees for point operations, while supporting similar performance (about 0.95×) on short range operations and faster (about 1.3×) long range operations.
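
The idea of delaying key ordering in a leaf can be illustrated with a small toy. The sketch below is not the paper's buffered partitioned array; `LazyLeaf`, its fields, and the `buffer_capacity` parameter are hypothetical names chosen for illustration. It only shows the general pattern the abstract describes: inserts append to an unordered buffer, and the keys are put in order later, when the buffer fills or a range scan needs them sorted.

```python
import bisect


class LazyLeaf:
    """Toy leaf node (hypothetical): a sorted run plus an unsorted insert buffer."""

    def __init__(self, buffer_capacity=4):
        self.sorted_keys = []          # keys already in order
        self.buffer = []               # recent inserts, not yet ordered
        self.buffer_capacity = buffer_capacity

    def insert(self, key):
        # Appending is cheap; ordering is deferred until a flush.
        self.buffer.append(key)
        if len(self.buffer) >= self.buffer_capacity:
            self._flush()

    def _flush(self):
        # Fold the buffered keys into the sorted run in one step.
        self.sorted_keys = sorted(self.sorted_keys + self.buffer)
        self.buffer = []

    def contains(self, key):
        # Binary search in the sorted run, then a linear scan of the small buffer.
        i = bisect.bisect_left(self.sorted_keys, key)
        if i < len(self.sorted_keys) and self.sorted_keys[i] == key:
            return True
        return key in self.buffer

    def range_scan(self, lo, hi):
        # A range scan needs ordered keys, so it forces a flush first.
        self._flush()
        i = bisect.bisect_left(self.sorted_keys, lo)
        j = bisect.bisect_right(self.sorted_keys, hi)
        return self.sorted_keys[i:j]


leaf = LazyLeaf()
for k in (5, 1, 3):
    leaf.insert(k)
print(leaf.range_scan(2, 6))  # [3, 5]
```

The toy keeps duplicates and is single-threaded; a real structure would also have to handle concurrency and node splits, which this sketch leaves out.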

Blink-hash: An Adaptive Hybrid Index for In-Memory Time-Series Databases Hokeun Cha (University of Wisconsin-Madison)*; Xiangpeng Hao (University of Wisconsin Madison); Tianzheng Wang (Simon Fraser University); Huanchen Zhang (Tsinghua University); Aditya Akella (UT Austin); Xiangyao Yu (University of Wisconsin-Madison)

ONe Index for All Kernels (ONIAK): A Zero Re-Indexing LSH Solution to ANNS-ALT (After Linear Transformation) Jingfan Meng (Georgia Institute of Technology); Huayi Wang (Georgia Institute of Technology); Jun Xu (Georgia Tech); Mitsunori Ogihara (University of Miami)

Foundation Models & Databases

Chair: Fei Chiang (McMaster University)

How Large Language Models Will Disrupt Data Management [vision] Raul Castro Fernandez (The University of Chicago)*; Aaron J Elmore (University of Chicago); Michael J Franklin (University of Chicago); Sanjay Krishnan (U Chicago); Chenhao Tan (University of Chicago)

Can Foundation Models Wrangle Your Data? [vision] Avanika Narayan (Stanford University)*; Ines Chami (Numbers Station); Laurel Orr (Stanford University); Christopher Ré (Stanford University)

CatSQL: Towards Real World Natural Language to SQL Applications Han Fu (Alibaba Group)*; Chang Liu (Alibaba Group); Bin Wu (Alibaba Group); Feifei Li (Alibaba Group); Jian Tan (Alibaba); Jianling Sun (Zhejiang University)

Natural language to SQL (NL2SQL) techniques provide a convenient interface to access databases for data analytics and non-expert database users. Existing methods for this problem employ either a rule-based approach or a deep learning-backed solution. Rule-based approaches are hard to generalize across different domains. Deep learning-based solutions generalize better across different domains, but they often produce queries with syntactic or semantic errors, which are thus not executable over the underlying database. In this work, we are the first to bridge these two approaches and make novel developments to achieve significantly better performance in terms of both accuracy and runtime. In particular, our solution develops a novel CatSQL sketch, which is a template with empty slots, and develops a deep learning model to fill in the slots. Compared with existing sequence-to-sequence-based approaches, our sketch-based method does not need to generate keywords that are boilerplate in the template, and thus can achieve better accuracy and run much faster. Compared with previous sketch-based approaches, our CatSQL sketch is more general, being largely equivalent to standard SQL, and our model can leverage the values filled in one slot when filling in another to improve the performance. In addition, we propose the Semantics Correction technique, which is the first technique to leverage database domain knowledge in a deep learning-based NL2SQL solution. Semantics Correction is a post-processing routine, which runs over generated SQL queries and employs rules to identify semantic errors and try to fix them. This technique significantly improves the NL2SQL accuracy. We conduct extensive evaluation on both single-domain and cross-domain benchmarks and demonstrate that our approach can significantly outperform all previous approaches in terms of both accuracy and throughput.
In particular, on state-of-the-art NL2SQL benchmarks such as Spider, our CatSQL prototype outperforms the existing state-of-the-art solution by 4 points on accuracy, while achieving up to 63x higher throughput.
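
A sketch with empty slots, followed by a rule-based check against the schema, can be pictured roughly as follows. This is only an illustration under stated assumptions: the template string, the slot names, and the helpers `fill_sketch` and `unknown_columns` are hypothetical and are not taken from the CatSQL paper, and the slot values are supplied by hand here, whereas the abstract describes a deep learning model filling them.

```python
# Hypothetical template: the SQL keywords are fixed, only the slots vary.
SKETCH = "SELECT {columns} FROM {table} WHERE {condition}"


def fill_sketch(slots):
    """Render a query by substituting slot values into the fixed template."""
    return SKETCH.format(
        columns=", ".join(slots["columns"]),
        table=slots["table"],
        condition=slots["condition"],
    )


def unknown_columns(slots, schema):
    """Rule-based check: list selected columns that the table does not have."""
    known = schema.get(slots["table"], set())
    return [c for c in slots["columns"] if c not in known]


schema = {"users": {"name", "age"}}
slots = {"columns": ["name", "age"], "table": "users", "condition": "age > 30"}

print(fill_sketch(slots))              # SELECT name, age FROM users WHERE age > 30
print(unknown_columns(slots, schema))  # []
```

Because the keywords live in the template, only the slot contents have to be predicted, and a check such as `unknown_columns` can flag a query that refers to a column the schema does not contain before it is executed.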

Hardware Acceleration

Chair: Kyuseok Shim (Seoul National University)

Excalibur: A Virtual Machine for Adaptive Fine-grained JIT-Compiled Query Execution based on VOILA Tim Gubner (CWI); Peter Boncz (CWI)*

Bringing Compiling Databases to RISC Architectures Ferdinand Gruber (Technical University of Munich)*; Maximilian Bandle (TUM); Alexis Engelke (Technical University of Munich); Thomas Neumann (TUM); Jana Giceva (TU Munich)

Deploying Computational Storage for HTAP DBMSs Takes More Than Just Computation Offloading Kitaek Lee (Hanyang University); Insoon Jo (Hanyang University); Jaechan Ahn (Hanyang University); Hyuk Lee (Samsung Electronics); Hwang Lee (Samsung Electronics); Woong Sul (Hanyang University); Hyungsoo Jung (Hanyang University)*

Enabling Transparent Acceleration of Big Data Frameworks Using Heterogeneous Hardware Maria Xekalaki (The University of Manchester); Juan Fumero (The University of Manchester); Athanasios Stratikopoulos (The University of Manchester); Katerina Doka (National Technical University of Athens); Christos Katsakioris (National Technical University of Athens); Constantinos Bitsakos (NTUA); Nectarios Koziris (NTUA); Christos Kotselidis (The University of Manchester)

Optimizing Queries & Beyond

Chair: Renata Borovica-Gajic (University of Melbourne)

Analyzing the Impact of Cardinality Estimation on Execution Plans in Microsoft SQL Server Kukjin Lee (Microsoft); Anshuman Dutt (Microsoft Research)*; Vivek Narasayya (Microsoft); Surajit Chaudhuri (Microsoft Research)

Robust Query Driven Cardinality Estimation under Changing Workloads Parimarjan Negi (MIT CSAIL)*; Ziniu Wu (Massachusetts Institute of Technology); Andreas Kipf (Amazon Web Services); Nesime Tatbul (Intel Labs and MIT); Ryan Marcus (Brandeis University); Samuel Madden (Massachusetts Institute of Technology); Tim Kraska (Massachusetts Institute of Technology); Mohammad Alizadeh (MIT CSAIL)

Query driven cardinality estimation models learn from a historical log of queries. They are lightweight, with low storage requirements and fast inference and training, and are easily adaptable to any kind of query. However, they can get unpredictably bad under workload drift, i.e., if the query pattern or data changes. This makes them unreliable and hard to deploy. We analyze the reasons why models become unpredictable due to workload drift, and introduce modifications to the query representation and neural network training techniques that make them robust to the effects of workload drift. First, we emulate workload drift in queries involving some unseen tables or columns by randomly masking out some table or column features during training. This forces the model to make predictions with missing query information, relying more on robust features based on up-to-date DBMS statistics that are useful even when query or data drift happens. Second, we introduce join bitmaps, which extend sampling-based features to be consistent across joins using ideas from sideways information passing. Finally, we show how both of these ideas can be adapted to handle data updates. We show significantly greater generalization than past works across different workloads and databases. For instance, a model trained with our techniques on a simple workload (JOBLight-train), with 40K synthetically generated queries of at most 3 tables each, is able to generalize to the much more complex Join Order Benchmark, which includes queries with up to 16 tables, and improve query runtimes by 2x over PostgreSQL. We show similar robustness results with data updates, and across other workloads. We discuss the situations where we expect, and see, improvements, as well as more challenging workload drift scenarios where these techniques do not improve much over PostgreSQL.
However, even in the most challenging scenarios, our models do not perform worse than PostgreSQL, while standard query driven models can get much worse than PostgreSQL.
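
The random masking of table and column features during training can be pictured with a small sketch. It is an illustration only, not the paper's featurization: the feature names, the `mask_features` helper, and the choice of `0.0` as the masked value are all assumptions made here.

```python
import random


def mask_features(features, maskable, p, rng):
    """Return a copy of `features` with each maskable entry zeroed with probability p.

    `features` maps hypothetical feature names to numeric values. Only names
    listed in `maskable` (table/column identity features) can be masked;
    statistics-based features are left untouched.
    """
    masked = dict(features)
    for name in maskable:
        if name in masked and rng.random() < p:
            masked[name] = 0.0
    return masked


rng = random.Random(0)
query_features = {
    "table:title": 1.0,         # one-hot style table indicator (assumed encoding)
    "col:production_year": 1.0,  # one-hot style column indicator (assumed encoding)
    "stat:est_rows": 0.42,       # feature derived from DBMS statistics
}
maskable = ["table:title", "col:production_year"]

# Applied to each training example, so the model sometimes sees queries
# with the table/column identity hidden and must rely on the statistics.
print(mask_features(query_features, maskable, p=0.5, rng=rng))
```

Hiding the identity features some of the time is what pushes a model toward the statistics-based features, which stay meaningful when a query touches tables or columns absent from the training log.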

Scaling a Declarative Cluster Manager Architecture with Query Optimization Techniques Kexin Rong (Georgia Institute of Technology)*; Mihai Budiu (VMware Research); Athinagoras Skiadopoulos (Stanford University); Lalith Suresh (VMware Research); Amy Tai (Google)

Leveraging Application Data Constraints to Optimize Database-Backed Web Applications Xiaoxuan Liu (UC Berkeley)*; Shuxian Wang (UC Berkeley); Mengzhu Sun (University of California at Berkeley); Sicheng Pan (UC Berkeley); Ge Li (University of California at Berkeley); Siddharth Jha (UC Berkeley); Cong Yan (Microsoft Research); Junwen Yang (The University of Chicago); Shan Lu (University of Chicago); Alvin Cheung (University of California at Berkeley)

QueryBooster: Improving SQL Performance Using Middleware Services for Human-Centered Query Rewriting Qiushi Bai (UC Irvine)*; Sadeem Alsudais (UC Irvine); Chen Li (UC Irvine)

Rethinking Query Optimization & Execution

Chair: Stefanie Scherzinger (University of Passau)

Opportunities for Quantum Acceleration of Databases: Optimization of Queries and Transaction Schedules [vision] Umut Çalıkyılmaz (University of Lübeck)*; Sven Groppe (University of Lübeck); Jinghua Groppe (Universität zu Lübeck); Tobias Winker (IFIS, University of Lübeck); Stefan Prestel (Quantum Brilliance GmbH); Farida Shagieva (Quantum Brilliance GmbH); Daanish Arya (Quantum Brilliance GmbH); Florian Preis (Quantum Brilliance GmbH); Le Gruenwald (The University of Oklahoma)

Asymptotically Better Query Optimization Using Indexed Algebra Philipp Fent (TUM)*; Guido Moerkotte (University of Mannheim); Thomas Neumann (TUM)

SlabCity: Whole-Query Optimization using Program Synthesis Rui Dong (University of Michigan)*; Jie Liu (University of Michigan); Yuxuan Zhu (University of Michigan); Cong Yan (Microsoft Research); Barzan Mozafari (University of Michigan); Xinyu Wang (University of Michigan)

Declarative Sub-Operators for Universal Data Processing Michael Jungmair (Technical University of Munich)*; Jana Giceva (TU Munich)

Transaction Processing I

Chair: Michael Abebe (Salesforce)

Fine-Grained Re-Execution for Efficient Batched Commit of Distributed Transactions Zhiyuan Dong (Shanghai Jiao Tong University)*; Zhaoguo Wang (Shanghai Jiao Tong University); Xiaodong Zhang (Shanghai Jiao Tong University); Xian Xu (SJTU); Changgeng Zhao (New York University); Haibo Chen (Shanghai Jiao Tong University); Aurojit Panda (New York University); Jinyang Li (New York University)

Epoxy: ACID Transactions Across Diverse Data Stores Peter Kraft (Stanford University)*; Qian Li (Stanford University); Xinjing Zhou (Massachusetts Institute of Technology); Peter D Bailis (Stanford University); Michael Stonebraker (Massachusetts Institute of Technology); Xiangyao Yu (University of Wisconsin-Madison); Matei Zaharia (Berkeley and Databricks)

Cornus: Atomic Commit for a Cloud DBMS with Storage Disaggregation Zhihan Guo (University of Wisconsin-Madison)*; Xinyu Zeng (University of Wisconsin-Madison); Kan Wu (University of Wisconsin-Madison); Wuh-Chwen Hwang (University of Wisconsin-Madison); Ziwei Ren (University of Wisconsin-Madison); Xiangyao Yu (University of Wisconsin-Madison); Mahesh Balakrishnan (Microsoft Research); Philip A Bernstein (Microsoft Research)

Nezha: Deployable and High-Performance Consensus Using Synchronized Clocks Jinkun Geng (Stanford University)*; Anirudh Sivaraman (New York University); Balaji Prabhakar (Stanford University); Mendel Rosenblum (Stanford University)

Transaction Processing II

Chair: Yongluan Zhou (University of Copenhagen)

TiQuE: Improving the Transactional Performance of Analytical Systems for True Hybrid Workloads Nuno Faria (INESCTEC & U. Minho)*; José Pereira (U. Minho & INESCTEC); Ana Nunes Alonso (INESC TEC & U.Minho); Ricardo Vilaça (INESC TEC and Universidade do Minho); Yunus Koning (MonetDB Solutions); Niels Nes (MonetDB Solutions)

Fries: Fast and Consistent Runtime Reconfiguration in Dataflow Systems with Transactional Guarantees Zuozhi Wang (UC Irvine)*; Shengquan Ni (UC Irvine); Avinash Kumar (UC Irvine); Chen Li (UC Irvine)

Scalable and Robust Snapshot Isolation for High-Performance Storage Engines Adnan Alhomssi (Friedrich-Alexander-Universität Erlangen-Nürnberg)*; Viktor Leis (Technische Universität München)

Modern Memory & Storage I

Chair: Srini V. Srinivasan (Aerospike)

WiscSort: External Sorting For Byte-Addressable Storage Vinay Banakar (University of Wisconsin Madison)*; Kan Wu (Google); Yuvraj Patel (University of Edinburgh); Kimberly Keeton (Google); Andrea C Arpaci-Dusseau (University of Wisconsin-Madison); Remzi H Arpaci-Dusseau (University of Wisconsin-Madison)

LRU-C: Parallelizing Database I/Os for Flash SSDs Bo-Hyun Lee (Sungkyunkwan University); Mijin An (Sungkyunkwan University); Sang-Won Lee (Sungkyunkwan University)*

WALTZ: Leveraging Zone Append to Tighten the Tail Latency of LSM Tree on ZNS SSD Jongsung Lee (Seoul National University)*; Donguk Kim (Seoul National University); Jae W. Lee (Seoul National University)

FlashAlloc: Dedicating Flash Blocks By Objects Jonghyeok Park (Hankuk University of Foreign Studies); Soyee Choi (SungKyunKwan University); Gihwan Oh (Sungkyunkwan University); Soojun Im (Samsung Electronics); Moon-Wook Oh (Samsung Electronics); Sang-Won Lee (Sungkyunkwan University)*

Write-Aware Timestamp Tracking: Effective and Efficient Page Replacement for Modern Hardware Demian E Vöhringer (Friedrich-Alexander-Universität Erlangen-Nürnberg)*; Viktor Leis (Technische Universität München)

Modern Memory & Storage II

Chair: Peter Boncz (CWI)

When Database Meets New Storage Devices: Understanding and Exposing Performance Mismatches via Configurations [eab] Haochen He (National University of Defense Technology)*; Erci Xu (NUDT); Shanshan Li (National University of Defense Technology); Zhouyang Jia (National University of Defense Technology); Si Zheng (National University of Defense Technology); Yue Yu (National University of Defense Technology); Jun Ma (National University of Defense Technology); Xiangke Liao (School of Computer Science, National University of Defense Technology)

What Modern NVMe Storage Can Do, And How To Exploit It: High-Performance I/O for High-Performance Storage Engines Gabriel Haas (TUM)*; Viktor Leis (Technische Universität München)

NVM: Is it Not Very Meaningful for Databases? [eab] Dimitrios Koutsoukos (ETHZ)*; Raghav Bhartia (ETH); Michal Friedman (ETH); Ana Klimovic (ETH Zurich); Gustavo Alonso (ETHZ)

NV-SQL: Boosting OLTP Performance with Non-Volatile DIMMs Mijin An (Sungkyunkwan University); Jonghyeok Park (Hankuk University of Foreign Studies); Tianzheng Wang (Simon Fraser University); Beomseok Nam (Sungkyunkwan University); Sang-Won Lee (Sungkyunkwan University)*

Sketching & Streaming

Chair: Alkis Simitsis (Athena Research and Innovation Center)

Dalton: Learned Partitioning for Distributed Data Streams Eleni Zapridou (EPFL)*; Ioannis Mytilinis (EPFL); Anastasia Ailamaki (EPFL)

Efficient framework for operating on data sketches Jakub Lemiesz (Wrocław University of Science and Technology)*

Optimistic Data Parallelism for FPGA-Accelerated Sketching Martin Kiefer (TU Berlin)*; Ilias Poulakis (TU Berlin); Eleni Tzirita Zacharatou (IT University of Copenhagen); Volker Markl (Technische Universität Berlin)

Out-of-Order Sliding-Window Aggregation with Efficient Bulk Evictions and Insertions Kanat Tangwongsan (Mahidol University International College); Martin Hirzel (IBM Research)*; Scott Schneider (Meta)

High-Performance Row Pattern Recognition Using Joins Erkang Zhu (Microsoft Research)*; Silu Huang (Microsoft Research); Surajit Chaudhuri (Microsoft Research)

Benchmarking & Performance I

Chair: Shi Qiao (SmartApps)

HMAB: Self-Driving Hierarchy of Bandits for Integrated Physical Database Design Tuning R. Malinga Perera (University of Melbourne)*; Bastian Oetomo (University of Melbourne); Benjamin I. P. Rubinstein (University of Melbourne); Renata Borovica-Gajic (University of Melbourne)

M2Bench: A Database Benchmark for Multi-Model Analytic Workloads [eab] Bogyeong Kim (Seoul National University); Kyoseung Koo (Seoul National University); Undraa Enkhbat (Seoul National University); Sohyun Kim (Seoul National University Database System Lab); Juhun Kim (Seoul National University); Bongki Moon (Seoul National University)*

Analyzing Vectorized Hash Tables Across CPU Architectures [eab] Maximilian Böther (ETH Zurich)*; Lawrence Benson (HPI, University of Potsdam); Ana Klimovic (ETH Zurich); Tilmann Rabl (HPI, University of Potsdam)

TSM-Bench: Benchmarking Time Series Database Systems for Monitoring Applications [eab] Abdelouahab Khelifati (University of Fribourg)*; Mourad Khayati (University of Fribourg); Anton Dignös (Free University of Bozen-Bolzano); Djellel Difallah (New York University); Philippe Cudré-Mauroux (University of Fribourg)

VeriBench: Analyzing the Performance of Database Systems with Verifiability [eab] Cong Yue (National University of Singapore); Meihui Zhang (Beijing Institute of Technology); Changhao Zhu (Beijing Institute of Technology); Gang Chen (Zhejiang University); Dumitrel Loghin (National University of Singapore); Beng Chin Ooi (NUS)*

Benchmarking & Performance II

Chair: Chenhao Ma (Chinese University of Hong Kong, Shenzhen)

A Deep Dive into Common Open Formats for Analytical DBMSs [eab] Chunwei Liu (Massachusetts Institute of Technology)*; Anna Pavlenko (Microsoft Gray Systems Lab); Matteo Interlandi (Microsoft); Brandon Haynes (Microsoft Gray Systems Lab)

The LDBC Social Network Benchmark: Business Intelligence Workload [eab] Gábor Szárnyas (CWI)*; Jack Waudby (Newcastle University); Benjamin A. Steer (pometry); Dávid Szakállas (LDBC); Altan Birler (TUM); Mingxi Wu (TigerGraph); Yuchen Zhang (TigerGraph); Peter Boncz (CWI)

Cloud Analytics Benchmark [eab] Alexander van Renen (Friedrich-Alexander-Universität Erlangen-Nürnberg)*; Viktor Leis (Technische Universität München)

Cloud DB & Parallelism

Chair: Eric Lo (Chinese University of Hong Kong)

InfiniStore: Elastic Serverless Cloud Storage Jingyuan Zhang (George Mason University)*; Ao Wang (George Mason University); Xiaolong Ma (University of Nevada, Reno); Benjamin Carver (George Mason University); Nicholas John Newman (George Mason University); Ali Anwar (University of Minnesota); Lukas Rupprecht (IBM Research); Vasily Tarasov (IBM Research); Dimitrios Skourtis (Redpanda Data); Feng Yan (University of Houston); Yue Cheng (University of Virginia)

Exploiting Cloud Object Storage for High-Performance Analytics Dominik Durner (TUM)*; Viktor Leis (Technische Universität München); Thomas Neumann (TUM)

Pando: Enhanced Data Skipping with Logical Data Partitioning Sivaprasad Sudhir (Massachusetts Institute of Technology)*; Wenbo Tao (Meta Platforms); Nikolay Laptev (Meta); Cyrille Habis (Meta); Michael Cafarella (MIT CSAIL); Samuel Madden (Massachusetts Institute of Technology)

Parallelism-Optimizing Data Placement for Faster Data-Parallel Computations Nirvik Baruah (Stanford University); Peter Kraft (Stanford University)*; Fiodar Kazhamiaka (Stanford); Peter D Bailis (Stanford University); Matei Zaharia (Berkeley and Databricks)

Tigger: A Database Proxy That Bounces With User-Bypass Matthew Butrovich (Carnegie Mellon University)*; Karthik Ramanathan (Carnegie Mellon University); John Rollinson (Army Cyber Institute); Wan Shen Lim (Carnegie Mellon University); William Zhang (Carnegie Mellon University); Justine Sherry (Carnegie Mellon University); Andrew Pavlo (Carnegie Mellon University)

Modern Memory & Storage III

Chair: El Kindi Rezig (University of Utah)

TreeLine: An Update-In-Place Key-Value Store for Modern Storage Geoffrey X. Yu (Massachusetts Institute of Technology)*; Markos Markakis (Massachusetts Institute of Technology); Andreas Kipf (Amazon Web Services); Per-Åke Larson (University of Waterloo); Umar Farooq Minhas (Apple); Tim Kraska (Massachusetts Institute of Technology)

PIM-tree: A Skew-resistant Index for Processing-in-Memory Hongbo Kang (Tsinghua University)*; Yiwei Zhao (Carnegie Mellon University); Guy E Blelloch (Carnegie Mellon University); Laxman Dhulipala (University of Maryland, College Park); Yan Gu (UC Riverside); Charles McGuffey (Reed University); Phillip B Gibbons (Carnegie Mellon University)

A Design Space Exploration and Evaluation for Main-Memory Hash Joins in Storage Class Memory [eab] Wentao Huang (National University of Singapore)*; Yunhong Ji (Renmin University of China); Xuan Zhou (East China Normal University); Bingsheng He (National University of Singapore); Kian-Lee Tan (National University of Singapore)

Dotori: A Key-Value SSD Based KV Store Carl Duffy (Seoul National University)*; Jaehoon Shim (Seoul National University); Sang-Hoon Kim (Ajou University); Jin-Soo Kim (Seoul National University)

DINOMO: An Elastic, Scalable, High-Performance Key-Value Store for Disaggregated Persistent Memory Sekwon Lee (University of Texas at Austin); Soujanya Ponnapalli (The University of Texas at Austin); Sharad Singhal (Hewlett Packard Labs); Marcos Aguilera (VMware Research); Kimberly Keeton (Google); Vijay Chidambaram (UT Austin and VMWare)

Compression

Chair: Panagiotis Karras (Aarhus University)

Sim-Piece: Highly Accurate Piecewise Linear Approximation through Similar Segment Merging Xenophon Kitsios (Athens University of Economics and Business); Panagiotis Liakos (University of Athens)*; Katia Papakonstantinopoulou (Athens University of Economics and Business); Yannis Kotidis (Athens University of Economics and Business)

The FastLanes Compression Layout: Decoding >100 Billion Integers per Second with Scalar Code Azim Afroozeh (CWI)*; Peter Boncz (CWI)

Toward Quantity-of-Interest Preserving Lossy Compression for Scientific Data Pu Jiao (University of Kentucky); Sheng Di (Argonne National Laboratory, Lemont, IL); Hanqi Guo (The Ohio State University); Kai Zhao (Florida State University); Jiannan Tian (Washington State University); Dingwen Tao (Indiana University); Xin Liang (University of Kentucky)*; Franck Cappello (Argonne National Laboratory, Lemont, IL)

Video Data

Chair: Yao Lu (Microsoft Research)

Extract-Transform-Load for Video Streams Ferdinand Kossmann (Massachusetts Institute of Technology)*; Ziniu Wu (Massachusetts Institute of Technology); Eugenie Y. Lai (Massachusetts Institute of Technology); Nesime Tatbul (Intel Labs and MIT); Lei Cao (University of Arizona/MIT); Tim Kraska (Massachusetts Institute of Technology); Samuel Madden (Massachusetts Institute of Technology)

EQUI-VOCAL: Synthesizing Queries for Compositional Video Events from Limited User Interactions Enhao Zhang (University of Washington)*; Maureen Daum (University of Washington); Dong He (University of Washington); Brandon Haynes (Microsoft Gray Systems Lab); Ranjay Krishna (University of Washington); Magdalena Balazinska (UW)

Optimizing Video Analytics with Declarative Model Relationships Francisco Romero (Stanford University)*; Johann Hauswald (Stanford University); Aditi Partap (Stanford University); Daniel Kang (Stanford University); Matei Zaharia (Berkeley and Databricks); Christos Kozyrakis (Stanford University)

Time-Series Analytics

Chair: John Paparrizos (Ohio State University)

Choose Wisely: An Extensive Evaluation of Model Selection for Anomaly Detection in Time Series [eab] Emmanouil Sylligardos (FORTH); Paul Boniol (Université de Paris)*; John Paparrizos (The Ohio State University); Panos Trahanias (FORTH); Themis Palpanas (Université Paris Cité)

Time2Feat: Learning Interpretable Representations for Multivariate Time Series Clustering [sds] Angela Bonifati (University of Lyon); Francesco Del Buono (University of Modena e Reggio Emilia); Francesco Guerra (University of Modena e Reggio Emilia); Donato Tiano (Università degli Studi di Modena e Reggio Emilia)*

Motiflets - Simple and Accurate Detection of Motifs in Time Series Patrick Schäfer (Humboldt-Universität zu Berlin)*; Ulf Leser (Humboldt-Universität zu Berlin)

A time series motif intuitively is a short time series that repeats itself approximately the same within a larger time series. Such motifs often represent concealed structures, such as heart beats in an ECG recording, the riff in a pop song, or sleep spindles in EEG sleep data. Motif discovery (MD) is the task of finding such motifs in a given input series. As there are varying definitions of what exactly a motif is, a number of different algorithms exist. As central parameters, they all take the length l of the motif and the maximal distance r between the motif's occurrences. In practice, however, suitable values for r in particular are very hard to determine upfront, and the motifs found show high variability even for very similar r values. Accordingly, finding an interesting motif with these methods requires extensive trial and error. In this paper, we present a different approach to the MD problem. We define k-Motiflets as the set of exactly k occurrences of a motif of length l, whose maximum pairwise distance is minimal. This turns the MD problem upside-down: The central parameter of our approach is not the distance threshold r, but the desired number of occurrences k of the motif, which we show is considerably more intuitive and easier to set. Based on this definition, we present exact and approximate algorithms for finding k-Motiflets and analyze their complexity. To further ease the use of our method, we describe statistical tools to automatically determine meaningful values for its input parameters. Thus, for the first time, extracting meaningful motif sets without any a priori knowledge becomes feasible. By evaluation on several real-world data sets and comparison to four state-of-the-art MD algorithms, we show that our proposed algorithm is both quantitatively superior to its competitors, finding larger motif sets at higher similarity, and qualitatively better, leading to clearer and easier-to-interpret motifs without any need for manual tuning.
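
The k-Motiflet definition in the abstract above can be made concrete with a brute-force sketch: enumerate every set of k non-overlapping length-l subsequences and keep the one whose maximum pairwise distance is smallest. This only illustrates the definition and is not one of the exact or approximate algorithms from the paper. The function names, the use of plain (non-normalized) Euclidean distance, and the non-overlap rule are assumptions made for this example.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_motiflet_bruteforce(series, k, l):
    """Return (start_offsets, extent) of the k non-overlapping length-l
    subsequences whose maximum pairwise distance (extent) is minimal.

    Illustrative only: enumerates all candidate sets, so the running time
    grows combinatorially with the series length.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    n_subs = len(series) - l + 1
    subs = [series[i:i + l] for i in range(max(n_subs, 0))]
    best_set, best_extent = None, math.inf
    for starts in combinations(range(len(subs)), k):
        # Assumed exclusion rule: occurrences may not overlap.
        if any(b - a < l for a, b in zip(starts, starts[1:])):
            continue
        extent = max(euclidean(subs[i], subs[j])
                     for i, j in combinations(starts, 2))
        if extent < best_extent:
            best_set, best_extent = starts, extent
    return best_set, best_extent


# The pattern 0, 1, 0 occurs three times, separated by unrelated values.
series = [0, 1, 0, 5, 5, 5, 0, 1, 0, 9, 9, 9, 0, 1, 0]
print(k_motiflet_bruteforce(series, k=3, l=3))
```

Note that the caller supplies only k and l; no distance threshold r appears anywhere, which is the shift in parameterization the abstract argues for.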

Fast and Scalable Mining of Time Series Motifs with Probabilistic Guarantees Matteo Ceccarello (University of Padova); Johann Gamper (Free University of Bozen-Bolzano, Italy)

OneShotSTL: One-Shot Seasonal-Trend Decomposition For Online Time Series Anomaly Detection And Forecasting Xiao He (Alibaba Group)*; Ye Li (Alibaba); Jian Tan (Alibaba); Bin Wu (Alibaba Group); Feifei Li (Alibaba Group)

Spatial & Multi-Dimensional Indexing

Chair: Jieming Shi (Hong Kong Polytechnic University)

Towards Designing and Learning Piecewise Space-Filling Curves Jiangneng Li (Nanyang Technological University)*; Zheng Wang (Nanyang Technological University); Gao Cong (Nanyang Technological University); Cheng Long (Nanyang Technological University); Han Mao Kiah (Nanyang Technological University); Bin Cui (Peking University)

Adaptive Indexing of Objects with Spatial Extent Fatemeh Zardbani (Aarhus University); Nikos Mamoulis (University of Ioannina); Stratos Idreos (Harvard); Panagiotis Karras (Aarhus University)*

Adaptive Indexing in High-Dimensional Metric Spaces Konstantinos Lampropoulos (University of Ioannina); Fatemeh Zardbani (Aarhus University); Nikos Mamoulis (University of Ioannina)*; Panagiotis Karras (Aarhus University)

Waffle: A Workload-Aware and Query-Sensitive Framework for Disk-Based Spatial Indexing Moin Hussain Moti (HKUST)*; Panagiotis Simatis (HKUST); Dimitris Papadias (HKUST)

Data Samples & Summaries

Chair: Silu Huang (Microsoft Research)

Bayesian Sketches for Volume Estimation in Data Streams Francesco Da Dalt (ETH Zürich)*; Simon Scherrer (ETH Zurich); Adrian Perrig (ETH Zurich)

Panakos: Chasing the Tails for Multidimensional Data Streams Fuheng Zhao (UCSB)*; Punnal Ismail Khan (UCSB); Divyakant Agrawal (University of California at Santa Barbara); Amr El Abbadi (UC Santa Barbara); Arpit Gupta (University of California at Santa Barbara); Zaoxing Liu (Boston University)

Efficient Approximation of Certain and Possible Answers for Ranking and Window Queries over Uncertain Data Su Feng (Illinois Institute of Technology)*; Boris Glavic (Illinois Institute of Technology); Oliver A Kennedy (University at Buffalo, SUNY)

High-Dimensional Data Cubes Sachin Basil John (EPFL); Christoph Koch (EPFL, Switzerland)

No Repetition: Fast and Reliable Sampling with Highly Concentrated Hashing Anders Aamand (MIT); Debarati Das (BARC - Basic Algorithms Research Copenhagen, University of Copenhagen); Evangelos Kipouridis (BARC - Basic Algorithms Research Copenhagen, University of Copenhagen); Jakob B.T. Knudsen (BARC - Basic Algorithms Research Copenhagen, University of Copenhagen); Peter M.R. Rasmussen (BARC - Basic Algorithms Research Copenhagen, University of Copenhagen); Mikkel Thorup (BARC - Basic Algorithms Research Copenhagen, University of Copenhagen)

Similarity Search

Chair: Dong Deng (Rutgers University)

Accelerating Similarity Search for Elastic Measures: A Study and New Generalization of Lower Bounding Distances [eab] John Paparrizos (The Ohio State University)*; Kaize Wu (University of Chicago); Aaron J Elmore (University of Chicago); Christos Faloutsos (Carnegie Mellon University); Michael J Franklin (University of Chicago)

MQH: Locality Sensitive Hashing on Multi-level Quantization Errors for Point-to-Hyperplane Distances Kejing Lu (Nagoya University)*; Yoshiharu Ishikawa (Nagoya University); Chuan Xiao (Osaka University, Nagoya University)

FARGO: Fast Maximum Inner Product Search via Global Multi-Probing Xi Zhao (Huazhong University of Science and Technology); Bolong Zheng (Huazhong University of Science and Technology)*; Xiaomeng Yi (Zhejiang Lab); Xiaofan Luan (ZilliZ); Charles Xie (Zilliz); Xiaofang Zhou (The Hong Kong University of Science and Technology); Christian S. Jensen (Aalborg University)

Odyssey: A Journey in the Land of Distributed Data Series Similarity Search Manos Chatzakis (EPFL)*; Panagiota Fatourou (University of Crete); Eleftherios Kosmas (University of Crete); Themis Palpanas (Université Paris Cité); Botao Peng (Institute of Computing Technology, Chinese Academy of Sciences)

Elpis: Graph-Based Similarity Search for Scalable Data Science [sds] Ilias Azizi (Mohammed VI Polytechnic University)*; Karima Echihabi (Mohammed VI Polytechnic University); Themis Palpanas (Université Paris Cité)

Matching & Spatial Crowdsourcing

Chair: Xiaohui Yu (York University)

Privacy-preserving Cooperative Online Matching over Spatial Crowdsourcing Platforms Yi Yang (Beijing Institute of Technology)*; Yurong Cheng (Beijing Institute of Technology); Ye Yuan (Beijing Institute of Technology); Guoren Wang (Beijing Institute of Technology); Lei Chen (Hong Kong University of Science and Technology); Yongjiao Sun (Northeastern University)

ACTA: Autonomy and Coordination Task Assignment in Spatial Crowdsourcing Platforms Boyang Li (Beijing Institute of Technology)*; Yurong Cheng (Beijing Institute of Technology); Ye Yuan (Beijing Institute of Technology); Yi Yang (Beijing Institute of Technology); Qianqian Jin (Beijing Institute of Technology, China); Guoren Wang (Beijing Institute of Technology)

Online Ridesharing with Meeting Points Jiachuan Wang (HKUST); Peng Cheng (East China Normal University); Libin Zheng (Sun Yat-sen University); Lei Chen (Hong Kong University of Science and Technology); Wenjie Zhang (University of New South Wales)

k-Best Egalitarian Stable Marriages for Task Assignment Siyuan Wu (University of Macau)*; Leong Hou U (University of Macau); Panagiotis Karras (Aarhus University)

Trajectories & Time Series

Chair: Dujian Ding (University of British Columbia)

A Deep Generative Model for Trajectory Modeling and Utilization Yong Wang (Tsinghua University); Guoliang Li (Tsinghua University)*; Kaiyu Li (Tsinghua University); Haitao Yuan (Baidu)

Efficient Non-Learning Similar Subtrajectory Search Jiabao Jin (East China Normal University); Peng Cheng (East China Normal University)*; Lei Chen (Hong Kong University of Science and Technology); Xuemin Lin (Shanghai Jiaotong University); Wenjie Zhang (University of New South Wales)

Effective and Efficient Route Planning Using Historical Trajectories on Road Networks Wei Tian (The Hong Kong Polytechnic University)*; Jieming Shi (The Hong Kong Polytechnic University); Siqiang Luo (Nanyang Technological University); Hui Li (Xiamen University); Xike Xie (University of Science and Technology of China); Yuanhang Zou (Tencent)

iEDeaL: A Deep Learning Framework for Detecting Highly Imbalanced Interictal Epileptiform Discharges [sds] Qitong Wang (Université Paris Cité)*; Stephen Whitmarsh (Sorbonne Université, Paris Brain Institute - ICM, Inserm, CNRS, APHP, Pitié-Salpêtrière Hospital); Vincent Navarro (Sorbonne Université, Paris Brain Institute - ICM, Inserm, CNRS, APHP, Pitié-Salpêtrière Hospital); Themis Palpanas (Université Paris Cité)

Scalable ML I

Chair: Rajesh Bordawekar (IBM Research)

Scalable Graph Convolutional Network Training on Distributed-Memory Systems Gunduz Vehbi Demirci (University of Warwick)*; Aparajita Haldar (University of Warwick); Hakan Ferhatosmanoglu (University of Warwick)

FastFlow: Accelerating Deep Learning Model Training with Smart Offloading of Input Data Pipeline Taegeon Um (Samsung Research)*; Byungsoo Oh (Samsung Research); Byeongchan Seo (Samsung Research); Minhyeok Kweun (Samsung Research); Goeun Kim (Samsung Research); Woo-Yeon Lee (Samsung Research)

MiCS: Near-linear Scaling for Training Gigantic Model on Public Cloud Zhen Zhang (Johns Hopkins University)*; Shuai Zheng (Amazon Web Services); Yida Wang (Amazon); Justin Chiu (Amazon); George Karypis (Amazon); Trishul A Chilimbi (Amazon); Mu Li (Amazon); Xin Jin (Peking University)

Data Discovery & Learning over Related Data

Chair: Fatemeh Nargesian (University of Rochester)

JoinBoost: Grow Trees Over Normalized Data Using Only SQL Zezhou Huang (Columbia University)*; Rathijit Sen (Microsoft); Jiaxiang Liu (Columbia University); Eugene Wu (Columbia University)

Cross Modal Data Discovery over Structured and Unstructured Data Lakes Mohamed Y. Eltabakh (Worcester Polytechnic Institute)*; Mayuresh Kunjir (Amazon AWS); Ahmed K. Elmagarmid (QCRI); Mohammad Shahmeer Ahmad (Qatar Computing Research Institute)

Organizations are collecting increasingly large amounts of data for data-driven decision making. These data are often dumped into a centralized repository, e.g., a data lake, consisting of thousands of structured and unstructured datasets. Perversely, such a mixture makes the problem of discovering tables or documents that are relevant to a user's query very challenging. Despite recent efforts in data discovery, the problem remains wide open, especially on two fronts: (1) discovering relationships and relatedness across structured and unstructured datasets, where existing techniques either scale poorly, are customized for a specific problem type (e.g., entity matching or data integration), or destroy the structural properties of the data along the way, and (2) developing a holistic system that integrates various similarity measurements and sketches effectively to boost discovery accuracy. In this paper, we propose a new data discovery system, named CMDL, for addressing these two limitations. CMDL supports the data discovery process over both structured and unstructured data while retaining the structural properties of tables. As a result, CMDL is the only system to date that empowers end-users to seamlessly pipeline the discovery tasks across the two modalities. We propose a novel multi-modal embedding representation that captures the similarities between text documents and tabular columns. The model training relies on labeled datasets generated through weak supervision, and thus the system is domain agnostic and easily generalizable. We evaluate CMDL on three real-world data lakes with diverse applications and show that our system is significantly more effective for cross-modality discovery compared to the search-based baseline techniques. Moreover, CMDL is more accurate and robust to different data types and distributions compared to the state-of-the-art systems that are limited to structured datasets.

RECA: Related Tables Enhanced Column Semantic Type Annotation Framework Yushi Sun (Hong Kong University of Science and Technology)*; Hao Xin (Hong Kong University of Science and Technology); Lei Chen (Hong Kong University of Science and Technology)

Scalable ML II

Chair: Alekh Jindal (SmartApps)

SubStrat: A Subset-Based Optimization Strategy for Faster AutoML [sds] Teddy Lazebnik (University College London); Amit Somech (Bar-Ilan University)*; Abraham Itzhak Weinberg (Bar Ilan University)

FederatedScope: A Flexible Federated Learning Platform for Heterogeneity Yuexiang Xie (Alibaba Group); Zhen Wang (Alibaba Group); Dawei Gao (Alibaba-inc); Daoyuan Chen (Alibaba Group); Liuyi Yao (Alibaba Group); Weirui Kuang (Alibaba Group); Yaliang Li (Alibaba Group)*; Bolin Ding (Data Analytics and Intelligence Lab, Alibaba Group); Jingren Zhou (Alibaba Group)

Optimizing Tensor Programs on Flexible Storage [sigmod] Maximilian Joel Schleich (RelationalAI); Amir Shaikhha (University of Edinburgh)*; Dan Suciu (University of Washington)

Towards Observability for Production Machine Learning Pipelines [vision] Shreya Shankar (University of California Berkeley); Aditya Parameswaran (University of California, Berkeley)

Causality & Explanation

Chair: Davood Rafiei (University of Alberta)

Causal Data Integration [vision] Brit Youngmann (Massachusetts Institute of Technology)*; Michael Cafarella (MIT CSAIL); Babak Salimi (University of California at San Diego); Anna Zeng (Massachusetts Institute of Technology)

HENCE-X: Toward Heterogeneity-agnostic Multi-level Explainability for Deep Graph Networks Ge Lv (The Hong Kong University of Science and Technology)*; Chen Jason Zhang (The Hong Kong Polytechnic University); Lei Chen (HKUST)

POEM: Pattern-Oriented Explanations of Convolutional Neural Networks [sds] Vargha Dadvar (University of Waterloo); Lukasz Golab (University of Waterloo)*; Divesh Srivastava (AT&T Chief Data Office)

On Data-Aware Global Explainability of Graph Neural Networks Ge Lv (The Hong Kong University of Science and Technology)*; Lei Chen (HKUST)

Computing Rule-Based Explanations by Leveraging Counterfactuals Zixuan Geng (University of Washington)*; Maximilian Schleich (RelationalAI); Dan Suciu (University of Washington)
Show Abstract Download Paper

Fairness

Chair: Steven Whang (KAIST)

Why Not Yet: Fixing a Top-k Ranking that Is Not Fair to Individuals Zixuan Chen (Northeastern University)*; Panagiotis Manolios (Northeastern University); Mirek Riedewald (Northeastern University)
Show Abstract Download Paper

Consistent Range Approximation for Fair Predictive Modeling Jiongli Zhu (University of California San Diego)*; Sainyam Galhotra (University of Chicago); Nazanin Sabri (University of California at San Diego); Babak Salimi (University of California at San Diego)
Show Abstract Download Paper

Models and Mechanisms for Spatial Data Fairness Sina Shaham (University of Southern California); Gabriel Ghinita (Hamad Bin Khalifa University)*; Cyrus Shahabi (Computer Science Department, University of Southern California)
Show Abstract Download Paper

Satisfying Complex Top-k Fairness Constraints by Preference Substitutions Md. Mouinul Islam (New Jersey Institute of Technology); Dong Wei (NJIT); Baruch Schieber (New Jersey Institute of Technology); Senjuti Basu Roy (New Jersey Institute of Technology)*
Show Abstract Download Paper

Graph Analytics I

Chair: Sibo Wang (Chinese University of Hong Kong)

SUFF: Accelerating Subgraph Matching with Historical Data Xun Jian (HKUST)*; Zhiyuan Li (The Hong Kong University of Science and Technology); Lei Chen (Hong Kong University of Science and Technology)
Show Abstract Download Paper

Computing Graph Edit Distance via Neural Graph Matching Chengzhi Piao (Chinese University of Hong Kong); Tingyang Xu (Tencent AI Lab); Xiangguo Sun (CUHK); Yu Rong (Tencent AI Lab); Kangfei Zhao (Beijing Institute of Technology); Hong Cheng (Chinese University of Hong Kong)*
Show Abstract Download Paper

Graph edit distance (GED) computation is a fundamental NP-hard problem in graph theory. Given a graph pair $(G_1, G_2)$, GED is defined as the minimum number of primitive operations converting $G_1$ to $G_2$. Early studies focus on search-based inexact algorithms such as A*-beam search, and greedy algorithms using bipartite matching due to its NP-hardness. They can obtain a sub-optimal solution by constructing an edit path (the sequence of operations that converts $G_1$ to $G_2$). Recent studies convert the GED between a given graph pair $(G_1, G_2)$ into a similarity score in the range $(0, 1)$ by a well designed function. Then machine learning models (mostly based on graph neural networks) are applied to predict the similarity score. They achieve a much higher numerical precision than the sub-optimal solutions found by classical algorithms. However, a major limitation is that these machine learning models cannot generate an edit path. They treat the GED computation as a pure regression task to bypass its intrinsic complexity, but ignore the essential task of converting $G_1$ to $G_2$. This severely limits the interpretability and usability of the solution. In this paper, we propose a novel deep learning framework that solves the GED problem in a two-step manner: 1) The proposed graph neural network GEDGNN is in charge of predicting the GED value and a matching matrix; and 2) A post-processing algorithm based on $k$-best matching is used to derive $k$ possible node matchings from the matching matrix generated by GEDGNN. The best matching will finally lead to a high-quality edit path. Extensive experiments are conducted on three real graph data sets and synthetic power-law graphs to demonstrate the effectiveness of our framework. Compared to the best result of existing GNN-based models, the mean absolute error (MAE) on GED value prediction decreases by $4.9\% \sim 74.3\%$. 
Compared to the state-of-the-art searching algorithm Noah, the MAE on GED value based on edit path reduces by $53.6\% \sim 88.1\%$.
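
To make the link between a node matching and an edit path concrete, here is a minimal Python sketch. It is not the paper's GEDGNN or its k-best matching post-processing. It assumes unit costs, node-labeled undirected graphs, and that the first graph has no more nodes than the second; the function names and the input encoding (node count, edge list, label list) are hypothetical. Any injective node matching induces an edit path whose length is an upper bound on the GED, and for tiny graphs the minimum over all matchings can be found by brute force.

```python
from itertools import permutations


def edit_cost_from_matching(n1, edges1, labels1, n2, edges2, labels2, matching):
    """Length of the edit path induced by a node matching.

    matching[u] is the G2 node assigned to G1 node u. Assumes n1 <= n2,
    an injective matching, unit costs, and undirected simple graphs.
    """
    e2 = {frozenset(e) for e in edges2}
    # Relabel every matched node whose label differs.
    relabels = sum(1 for u in range(n1) if labels1[u] != labels2[matching[u]])
    # Unmatched G2 nodes must be inserted.
    node_insertions = n2 - n1
    # Image of G1's edges under the matching.
    mapped = {frozenset((matching[u], matching[v])) for u, v in edges1}
    edge_deletions = len(mapped - e2)   # G1 edges with no counterpart in G2
    edge_insertions = len(e2 - mapped)  # G2 edges not covered by G1 edges
    return relabels + node_insertions + edge_deletions + edge_insertions


def brute_force_ged(n1, edges1, labels1, n2, edges2, labels2):
    """Minimum induced cost over all injective matchings (tiny graphs only)."""
    return min(
        edit_cost_from_matching(n1, edges1, labels1, n2, edges2, labels2, m)
        for m in permutations(range(n2), n1)
    )


# Path 0-1-2 versus a triangle, all nodes labeled "a".
path_edges = [(0, 1), (1, 2)]
triangle_edges = [(0, 1), (1, 2), (0, 2)]
same_labels = ["a", "a", "a"]
identity_cost = edit_cost_from_matching(
    3, path_edges, same_labels, 3, triangle_edges, same_labels, (0, 1, 2)
)
exact = brute_force_ged(3, path_edges, same_labels, 3, triangle_edges, same_labels)
```

The brute-force search is exponential in the number of nodes, which is why learned or heuristic matchings are of interest; the sketch only shows how a matching is scored once it is available.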

ARKGraph: All-Range Approximate K-Nearest-Neighbor Graph Chaoji Zuo (Rutgers University - New Brunswick); Dong Deng (Rutgers University - New Brunswick)*
Show Abstract Download Paper

Given a collection of vectors, the approximate K-nearest-neighbor graph (KGraph for short) connects every vector to its approximate K-nearest-neighbors (KNN for short). KGraph plays an important role in data visualization, semantic search, manifold learning, and machine learning. The vectors are typically vector representations of real-world objects (e.g., images and documents), which often come with a few structured attributes, such as timestamps and locations. In this paper, we study the all-range approximate K-nearest-neighbor graph (ARKGraph) problem. Specifically, given a collection of vectors, each associated with a numerical search key value (e.g., a timestamp), we aim to build an index that takes a search key range as the query and returns the KGraph of vectors whose search keys are within the query range. ARKGraph can facilitate interactive high dimensional data visualization, data mining, etc. A key challenge of this problem is the huge index size. This is because, given $n$ vectors, a brute-force index stores a KGraph for every search key range, which results in $O(Kn^3)$ index size as there are $O(n^2)$ search key ranges and each KGraph takes $O(Kn)$ space. We observe that the KNN of a vector in nearby ranges are often the same, which can be grouped together to save space. Based on this observation, we propose a series of novel techniques that reduce the index size significantly to just $O(Kn\log n)$ in the average case. Furthermore, we develop an efficient indexing algorithm that constructs the optimized ARKGraph index directly without exhaustively calculating the distance between every pair of vectors. To process a query, for each vector in the query range, we only need $O(\log\log n + K\log K)$ to restore its KNN in the query range from the optimized ARKGraph index. We conducted extensive experiments on real-world datasets. 
Experimental results show that our optimized ARKGraph index achieved a small index size, low query latency, and good scalability. Specifically, our approach was 1000x faster than the baseline method that builds a KGraph for all the vectors in the query range on-the-fly.
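
As an illustration of the problem statement only, the following Python sketch builds a K-nearest-neighbor graph on the fly for the vectors whose search key falls in a query range. It uses exact brute-force Euclidean distances as a stand-in for an approximate KGraph construction, and it is not the ARKGraph index; the function name and the data layout are hypothetical.

```python
import math


def range_knn_graph(vectors, keys, lo, hi, k):
    """Exact KNN graph over the vectors whose key lies in [lo, hi].

    Returns a dict mapping each in-range vector id to the ids of its
    k nearest in-range neighbors (ties broken by smaller id).
    """
    ids = [i for i, key in enumerate(keys) if lo <= key <= hi]
    graph = {}
    for i in ids:
        others = [j for j in ids if j != i]
        # Sort candidates by distance to vector i, then by id for determinism.
        others.sort(key=lambda j: (math.dist(vectors[i], vectors[j]), j))
        graph[i] = others[:k]
    return graph


# Four 2-D vectors with timestamps as search keys.
vectors = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
keys = [10, 20, 30, 40]
graph = range_knn_graph(vectors, keys, lo=20, hi=40, k=1)
```

Each query of this kind costs time quadratic in the number of in-range vectors, and storing one such graph for every possible key range gives the $O(Kn^3)$ brute-force index size mentioned in the abstract.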

Quasi-stable Coloring for Graph Compression: Approximating Max-Flow, Linear Programs, and Centrality Moe Kayali (University of Washington)*; Dan Suciu (University of Washington)
Show Abstract Download Paper

Graph Analytics II

Chair: Arijit Khan (Nanyang Technological University)

Temporal SIR-GN: Efficient and Effective Structural Representation Learning for Temporal Graphs Janet Layne (Boise State University); Justin Carpenter (Boise State University); Edoardo Serra (Boise State University); Francesco Gullo (UniCredit)*
Show Abstract Download Paper

SUREL+: Moving from Walks to Sets for Scalable Subgraph-based Graph Representation Learning [sds] Haoteng Yin (Purdue University)*; Muhan Zhang (Peking University); Jianguo Wang (Purdue University); Pan Li (Georgia Tech.)
Show Abstract Download Paper

Scaling Up Structural Clustering to Large Probabilistic Graphs Using Lyapunov Central Limit Theorem Joseph N Howie (University of Victoria)*; Venkatesh Srinivasan (University of Victoria); Alex Thomo (University of Victoria)
Show Abstract Download Paper

Efficient Maximum k-Plex Computation over Large Sparse Graphs Lijun Chang (The University of Sydney)*; Mouyi Xu (The University of Sydney); Darren Strash ()
Show Abstract Download Paper

Text Processing & Search

Chair: Hazar Harmouch (Hasso Plattner Institute)

Pollock: A Data Loading Benchmark [eab] Gerardo Vitagliano (Hasso Plattner Institute)*; Mazhar Hameed (Hasso Plattner Institute); Lan Jiang (Hasso Plattner Institute); Lucas Reisener (Hasso Plattner Institute); Eugene Wu (Columbia University); Felix Naumann (Hasso Plattner Institute, University of Potsdam)
Show Abstract Download Paper

Text Indexing for Long Patterns: Anchors are All you Need Lorraine A. K. Ayad (Brunel University); Grigorios Loukides (King's College London)*; Solon P. Pissis (CWI)
Show Abstract Download Paper

Autonomously Computable Information Extraction Besat Kassaie (University of Waterloo); Frank Wm. Tompa (University of Waterloo)*
Show Abstract Download Paper

REmatch: a novel regex engine for finding all matches Cristian Riveros (PUC Chile); Nicolás Van Sint Jan (PUC); Domagoj Vrgoč (PUC)*
Show Abstract Download Paper

Web Record Extraction with Invariants Zhijia Chen (Temple University)*; Weiyi Meng (Binghamton University); Eduard Dragut (Temple University)
Show Abstract Download Paper

Graph Analytics III

Chair: Siqiang Luo (Nanyang Technological University)

Distributed Graph Embedding with Information-Oriented Random Walks Peng Fang (Huazhong University of Science and Technology)*; Arijit Khan (Aalborg University); Siqiang Luo (Nanyang Technological University); Fang Wang (Huazhong University of Science and Technology); Dan Feng (Huazhong University of Science and Technology); Zhenli Li (Huazhong University of Science and Technology); Wei Yin (Huazhong University of Science and Technology); Yuchao Cao (Huazhong University of Science and Technology)
Show Abstract Download Paper

Decoupled Graph Neural Networks for Large Dynamic Graphs [sds] Yanping Zheng (Renmin University of China)*; Zhewei Wei (Renmin University of China); Jiajun Liu (CSIRO)
Show Abstract Download Paper

Estimating Single-Node PageRank in $\tilde{O}\left(\min\{d_t, \sqrt{m}\}\right)$ Time Hanzhi Wang (Renmin University of China)*; Zhewei Wei (Renmin University of China)
Show Abstract Download Paper

Space-Efficient Random Walks on Streaming Graphs Serafeim Papadias (TU Berlin)*; Zoi Kaoudi (TU Berlin); Jorge-Arnulfo Quiané-Ruiz (IT University of Copenhagen); Volker Markl (Technische Universität Berlin)
Show Abstract Download Paper

Density Personalized Group Query Chih-Ya Shen (National Tsing Hua University); Shao-Heng Ko (Academia Sinica); Guang-Siang Lee (Academia Sinica); Wang-Chien Lee (Pennsylvania State University, USA); De-Nian Yang (Academia Sinica)*
Show Abstract Download Paper

Searching Data

Chair: Kaiyu Li (York University)

A Generic Framework for Efficient Computation of Top-k Diverse Results [vldbj] Md Mouinul Islam (New Jersey Institute of Technology); Mahsa Asadi (New Jersey Institute of Technology); Sihem Amer-Yahia (CNRS: Centre National de la Recherche Scientifique); Senjuti Basu Roy (New Jersey Institute of Technology)*
Show Abstract Download Paper

Survey of Window Types for Aggregation in Stream Processing Systems [vldbj] Juliane Verwiebe (Technische Universität Berlin); Philipp M Grulich (Technische Universität Berlin)*; Jonas Traub (Technische Universität Berlin); Volker Markl (Technische Universität Berlin)
Show Abstract Download Paper

A Survey on Deep Learning Approaches for Text-to-SQL [vldbj] George Katsogiannis (Athena Research and Innovation Center)*; Georgia Koutrika (Athena Research and Innovation Center)
Show Abstract Download Paper

Graph Analytics IV

Chair: Laks V.S. Lakshmanan (University of British Columbia)

MiniGraph: Querying Big Graphs with a Single Machine Xiaoke Zhu (Beihang University); Yang Liu (Beihang University); Shuhao Liu (Shenzhen Institute of Computing Sciences)*; Wenfei Fan (University of Edinburgh)
Show Abstract Download Paper

Parallel Colorful h-star Core Maintenance in Dynamic Graphs Sen Gao (National University of Singapore)*; Hongchao Qin (Beijing Institute of Technology); Ronghua Li (Beijing Institute of Technology); Bingsheng He (National University of Singapore)
Show Abstract Download Paper

MITra: A Framework for Multi-Instance Graph Traversal Jia Li (Edinburgh Research Center, Central Software Institute, Huawei); Wenyue Zhao (University of Edinburgh); Nikos Ntarmos (Edinburgh Research Center, Central Software Institute, Huawei); Yang Cao (University of Edinburgh)*; Peter Buneman (The University of Edinburgh)
Show Abstract Download Paper

Sage: A System for Uncertain Network Analysis Eunjae Lee (UNIST); Sam H. Noh (UNIST); Jiwon Seo (Hanyang University)
Show Abstract Download Paper

Community Search in Graphs

Chair: Yixiang Fang (University of Hong Kong, Shenzhen)

Influential Community Search over Large Heterogeneous Information Networks Yingli Zhou (The Chinese University of Hong Kong, Shenzhen)*; Yixiang Fang (The Chinese University of Hong Kong, Shenzhen); Wensheng Luo (School of Data Science, The Chinese University of Hong Kong, Shenzhen); Yunming Ye (Harbin Institute of Technology Shenzhen Graduate School)
Show Abstract Download Paper

Maximal D-truss Search in Dynamic Directed Graphs Anxin Tian (Hong Kong University of Science and Technology)*; Alexander Zhou (Hong Kong University of Science and Technology); Yue Wang (Shenzhen Institute of Computing Sciences); Lei Chen (Hong Kong University of Science and Technology)
Show Abstract Download Paper

Neighborhood-based Hypergraph Core Decomposition Naheed Anjum Arafat (Nanyang Technological University)*; Arijit Khan (Aalborg University); Arpit Kumar Rai (Indian Institute of Technology, Kanpur); Bishwamittra Ghosh (National University of Singapore)
Show Abstract Download Paper

Data Discovery & Integration

Chair: Oktie Hassanzadeh (IBM Research)

Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning Grace Fan (Northeastern University)*; Jin Wang (Megagon Labs); Yuliang Li (Megagon Labs); Dan Zhang (Megagon Labs); Renée J. Miller (Northeastern University)
Show Abstract Download Paper

DeepJoin: Joinable Table Discovery with Pre-trained Language Models Yuyang Dong (NEC corporation)*; Chuan Xiao (Osaka University, Nagoya University); Takuma Nozawa (NEC); Masafumi Enomoto (NEC); Masafumi Oyamada (NEC)
Show Abstract Download Paper

Effective Entity Augmentation By Querying External Data Sources Christopher Buss (Oregon State University)*; Jasmin Mousavi (Oregon State University); Mikhail Tokarev (Oregon State University); Arash Termehchy (Oregon State University); David Maier (Portland State University); Stefan Lee (Oregon State University)
Show Abstract Download Paper

Integrating Data Lake Tables Aamod Khatiwada (Northeastern University)*; Roee Shraga (Northeastern University); Wolfgang Gatterbauer (Northeastern University); Renée J. Miller (Northeastern University)
Show Abstract Download Paper

VersaMatch: Ontology Matching with Weak Supervision Jonathan Fürst (ZHAW)*; Mauricio Fadel Argerich (NEC Laboratories Europe); Bin Cheng (NEC Laboratories Europe)
Show Abstract Download Paper

Private Retrieval & Secure Execution I

Chair: Mostafa Milani (University of Western Ontario)

Secure Shapley Value for Cross-Silo Federated Learning Shuyuan Zheng (Kyoto University)*; Yang Cao (Hokkaido University); Masatoshi Yoshikawa (Osaka Seikei University)
Show Abstract Download Paper

Olive: Oblivious Federated Learning on Trusted Execution Environment Against the Risk of Sparsification Fumiyuki Kato (Kyoto University)*; Yang Cao (Hokkaido University); Masatoshi Yoshikawa (Osaka Seikei University)
Show Abstract Download Paper

L2chain: Towards High-performance, Confidential and Secure Layer-2 Blockchain Solution for Decentralized Applications Zihuan Xu (Hong Kong University of Science and Technology); Lei Chen (Hong Kong University of Science and Technology)*
Show Abstract Download Paper

SPG: Structure-Private Graph Database via SqueezePIR Ling Liang (UCSB)*; Jilan Lin (UCSB); Zheng Qu (University of California at Santa Barbara); Ishtiyaque Ahmad (University of California at Santa Barbara); Fengbin Tu (UCSB); Trinabh Gupta (UCSB); Yufei Ding (University of California at Santa Barbara); Yuan Xie (University of California at Santa Barbara)
Show Abstract Download Paper

Private Retrieval & Secure Execution II

Chair: Miti Mazmudar (University of Waterloo)

Pantheon: Private Retrieval from Public Key-Value Store Ishtiyaque Ahmad (University of California at Santa Barbara)*; Divyakant Agrawal (University of California at Santa Barbara); Amr El Abbadi (UC Santa Barbara); Trinabh Gupta (UCSB)
Show Abstract Download Paper

Information-Theoretically Secure and Highly Efficient Search and Row Retrieval Shantanu Sharma (New Jersey Institute of Technology)*; Yin Li (Dongguan University of Technology); Sharad Mehrotra (U.C. Irvine); Nisha Panwar (Augusta University); Komal Kumari (New Jersey Institute of Technology); Swagnik Roychoudhury (New York University)
Show Abstract Download Paper

ZKSQL: Verifiable and Efficient Query Evaluation with Zero-Knowledge Proofs Xiling Li (Northwestern University)*; Chenkai Weng (Northwestern University); Yongxin Xu (Northwestern University); Xiao Wang (Northwestern University); Jennie Rogers (Northwestern University)
Show Abstract Download Paper

Cracking-Like Join for Trusted Execution Environments Kajetan Maliszewski (TU Berlin)*; Jorge-Arnulfo Quiané-Ruiz (IT University of Copenhagen); Volker Markl (Technische Universität Berlin)
Show Abstract Download Paper

Enabling Secure and Efficient Data Analytics Pipeline Evolution with Trusted Execution Environment Haotian Gao (National University of Singapore); Cong Yue (National University of Singapore); Tien Tuan Anh Dinh (Deakin University); Zhiyong Huang (NUS School of Computing); Beng Chin Ooi (NUS)*
Show Abstract Download Paper

Blockchains

Chair: Sujaya Maiyya (University of Waterloo)

GlassDB: An Efficient Verifiable Ledger Database System Through Transparency Cong Yue (National University of Singapore); Tien Tuan Anh Dinh (Deakin University); Zhongle Xie (National University of Singapore); Meihui Zhang (Beijing Institute of Technology); Gang Chen (Zhejiang University); Beng Chin Ooi (NUS)*; Xiaokui Xiao (National University of Singapore)
Show Abstract Download Paper

FlexChain: An Elastic Disaggregated Blockchain Chenyuan Wu (University of Pennsylvania)*; Mohammad Javad Amiri (University of Pennsylvania); Jared Asch (University of Pennsylvania); Heena Nagda (University of Pennsylvania); Qizhen Zhang (University of Pennsylvania); Boon Thau Loo (University of Pennsylvania)
Show Abstract Download Paper

GriDB: Scaling Blockchain Database via Sharding and Off-Chain Cross-Shard Mechanism Zicong Hong (The Hong Kong Polytechnic University)*; Song Guo (The Hong Kong Polytechnic University); Enyuan Zhou (The Hong Kong Polytechnic University); Wuhui Chen (Sun Yat-sen University); Huawei Huang (Sun Yat-sen University); Albert Zomaya (The University of Sydney)
Show Abstract Download Paper

Blockchain databases have attracted widespread attention but suffer from poor scalability due to underlying non-scalable blockchains. While blockchain sharding is necessary for a scalable blockchain database, it poses a new challenge named on-chain cross-shard database services. Each cross-shard database service (e.g., cross-shard queries or inter-shard load balancing) involves massive cross-shard data exchanges, while the existing cross-shard mechanisms need to process each cross-shard data exchange via the consensus of all nodes in the related shards (i.e., on-chain) to resist a Byzantine environment of blockchain, which eliminates sharding benefits. To tackle the challenge, this paper presents GriDB, the first scalable blockchain database, by designing a novel off-chain cross-shard mechanism for efficient cross-shard database services. Borrowing the idea of off-chain payments, GriDB delegates massive cross-shard data exchange to a few nodes, each of which is randomly picked from a different shard. Considering the Byzantine environment, the untrusted delegates cooperate to generate succinct proof for cross-shard data exchanges, while the consensus is only responsible for the low-cost proof verification. However, different from payments, the database services' verification has more requirements (e.g., completeness, correctness, freshness, and availability); thus, we introduce several new authenticated data structures (ADS). Particularly, we utilize consensus to extend the threat model and reduce the complexity of traditional accumulator-based ADS for verifiable cross-shard queries with a rich set of relational operators. Moreover, we study the necessity of inter-shard load balancing for a scalable blockchain database and design an off-chain and live approach for both efficiency and availability during balancing. An evaluation of our prototype shows the performance of GriDB in terms of scalability in workloads with queries and updates.

AdaChain: A Learned Adaptive Blockchain Chenyuan Wu (University of Pennsylvania)*; Bhavana Mehta (University of Pennsylvania); Mohammad Javad Amiri (University of Pennsylvania); Ryan Marcus (University of Pennsylvania); Boon Thau Loo (University of Pennsylvania)
Show Abstract Download Paper

Differential Privacy I

Chair: Xi He (University of Waterloo)

Answering Private Linear Queries Adaptively using the Common Mechanism Yingtai Xiao (Pennsylvania State University)*; Guanhong Wang (University of Maryland); Danfeng Zhang (Penn State); Daniel Kifer (Penn State)
Show Abstract Download Paper

Epistemic Parity: Reproducibility as an Evaluation Metric for Differential Privacy [eab] Lucas Rosenblatt (New York University)*; Bernease Herman (University of Washington); Anastasia Holovenko (Ukrainian Catholic University); Wonkwon Lee (New York University); Joshua Loftus (London School of Economics); Elizabeth McKinnie (Microsoft); Taras Rumezhak (Ukrainian Catholic University); Andrii Stadnik (Ukrainian Catholic University); Bill Howe (University of Washington); Julia Stoyanovich (New York University)
Show Abstract Download Paper

Differential privacy (DP) data synthesizers are increasingly proposed to afford public release of sensitive information, offering theoretical guarantees for privacy (and, in some cases, utility), but limited empirical evidence of utility in practical settings. Utility is typically measured as the error on representative proxy tasks, such as descriptive statistics, multivariate correlations, the accuracy of trained classifiers, or performance over a query workload. The ability for these results to generalize to practitioners' experience has been questioned in a number of settings, including the U.S. Census. In this paper, we propose an evaluation methodology for synthetic data that avoids assumptions about the representativeness of proxy tasks, instead measuring the likelihood that published conclusions would change had the authors used synthetic data, a condition we call epistemic parity. Our methodology consists of reproducing empirical conclusions of peer-reviewed papers on real, publicly available data, then re-running these experiments a second time on DP synthetic data and comparing the results. We instantiate our methodology over a benchmark of recent peer-reviewed papers that analyze public datasets in the ICPSR social science repository. We model quantitative claims computationally to automate the experimental workflow, and model qualitative claims by reproducing visualizations and comparing the results manually. We then generate DP synthetic datasets using multiple state-of-the-art mechanisms, and estimate the likelihood that these conclusions will hold. We find that, for reasonable privacy regimes, state-of-the-art DP synthesizers are able to achieve high epistemic parity for several papers in our benchmark. However, some papers, and particularly some specific findings, are difficult to reproduce for any of the synthesizers. 
Given these results, we advocate for a new class of mechanisms that can reorder the priorities for DP data synthesis: favor stronger guarantees for utility (as measured by epistemic parity) and offer privacy protection with a focus on application-specific threat models and risk-assessment.

Equitable Data Valuation Meets the Right to Be Forgotten in Model Markets Haocheng Xia (Zhejiang University); Jinfei Liu (Zhejiang University)*; Jian Lou (Zhejiang University); Zhan Qin (Zhejiang University); Kui Ren (Zhejiang University); Yang Cao (Hokkaido University); Li Xiong (Emory University)
Show Abstract Download Paper

OpBoost: A Vertical Federated Tree Boosting Framework Based on Order-Preserving Desensitization Xiaochen Li (Zhejiang University)*; Yuke Hu (Zhejiang University); Weiran Liu (Alibaba Group); Hanwen Feng (Alibaba Group); Li Peng (Alibaba Group); Yuan Hong (University of Connecticut); Kui Ren (Zhejiang University); Zhan Qin (Zhejiang University)
Show Abstract Download Paper

Differential Privacy II

Chair: Yang Cao (Hokkaido University)

Longshot: Indexing Growing Databases using MPC and Differential Privacy Yanping Zhang (Duke University)*; Johes Bater (Tufts University); Kartik Nayak (Duke University); Ashwin Machanavajjhala (Duke)
Show Abstract Download Paper

Saibot: A Differentially Private Data Search Platform Zezhou Huang (Columbia University)*; Jiaxiang Liu (Columbia University); Daniel Alabi (Columbia University); Raul Castro Fernandez (The University of Chicago); Eugene Wu (Columbia University)
Show Abstract Download Paper

DPXPlain: Privately Explaining Aggregate Query Answers Yuchao Tao (SNAP)*; Amir Gilad (The Hebrew University); Ashwin Machanavajjhala (Duke); Sudeepa Roy (Duke University, USA)
Show Abstract Download Paper

Cache Me If You Can: Accuracy-Aware Inference Engine for Differentially Private Data Exploration Miti Mazmudar (University of Waterloo)*; Thomas Humphries (University of Waterloo); Jiaxiang Liu (University of Waterloo); Matthew Rafuse (University of Waterloo); Xi He (University of Waterloo)
Show Abstract Download Paper

Multi-Analyst Differential Privacy for Online Query Answering David A Pujol (Duke University)*; Albert Sun (Duke University); Brandon T Fain (Duke University); Ashwin Machanavajjhala (Duke)
Show Abstract Download Paper

Differential Privacy III

Chair: Johes Bater (Tufts University)

Benchmarking the Utility of 𝑤-event Differential Privacy Mechanisms - When Baselines Become Mighty Competitors [eab] Christine Schäler (Karlsruhe Institute of Technology (KIT)); Thomas Hütter (University of Salzburg); Martin Schäler (University of Salzburg)*
Show Abstract Download Paper

LDPTrace: Locally Differentially Private Trajectory Synthesis Yuntao Du (Zhejiang University); Yujia Hu (Zhejiang University); Zhikun Zhang (Stanford University); Ziquan Fang (Zhejiang University); Lu Chen (Zhejiang University); Baihua Zheng (Singapore Management University); Yunjun Gao (Zhejiang University)*
Show Abstract Download Paper

Trajectory Data Collection with Local Differential Privacy Yuemin Zhang (Harbin Engineering University); Qingqing Ye (Hong Kong Polytechnic University); Rui Chen (Harbin Engineering University)*; Haibo Hu (Hong Kong Polytechnic University); Qilong Han (Harbin Engineering University)
Show Abstract Download Paper

On the Risks of Collecting Multidimensional Data Under Local Differential Privacy Héber H. Arcolezi (Inria and École Polytechnique (IPP))*; Sébastien Gambs (UQAM); Jean-François Couchot (University of Franche-Comté); Catuscia Palamidessi (Laboratoire d'informatique de l'École polytechnique)
Show Abstract Download Paper

PreFair: Privately Generating Justifiably Fair Synthetic Data David A Pujol (Duke University)*; Amir Gilad (The Hebrew University); Ashwin Machanavajjhala (Duke)
Show Abstract Download Paper

R10

Temporal and Evolving Graphs

Chair: Udayan Khurana (IBM Research)

Auxo: A Scalable and Efficient Graph Stream Summarization Structure Zhiguo Jiang (Huazhong University of Science and Technology); Hanhua Chen (Huazhong University of Science and Technology)*; Hai Jin (Huazhong University of Science and Technology)
Show Abstract Download Paper

Scalable Time-Range k-Core Query on Temporal Graphs Junyong Yang (Wuhan University); Ming Zhong (Wuhan University)*; Yuanyuan Zhu (Wuhan University); Tieyun Qian (Wuhan University); Mengchi Liu (South China Normal University); Jeffrey Xu Yu (Chinese University of Hong Kong)
Show Abstract Download Paper

Anonymous Edge Representation for Inductive Anomaly Detection in Dynamic Bipartite Graphs Lanting Fang (Southeast University)*; Kaiyu Feng (Beijing Institute of Technology); Jie Gui (Southeast University); Shanshan Feng (Centre for Frontier AI Research, A*STAR); Aiqun Hu (Southeast University)
Show Abstract Download Paper

Spade: A Real-Time Fraud Detection Framework on Evolving Graphs [sds] Jiaxin Jiang (National University of Singapore)*; Yuan Li (National University of Singapore); Bingsheng He (National University of Singapore); Bryan Hooi (National University of Singapore); Jia Chen (Grab); Johan Kok Zhi Kang (Grab)
Show Abstract Download Paper

Mining Bursting Core in Large Temporal Graph Hongchao Qin (Beijing Institute of Technology); Rong-Hua Li (Beijing Institute of Technology); Ye Yuan (Beijing Institute of Technology); Guoren Wang (Beijing Institute of Technology); Lu Qin (UTS); Zhiwei Zhang (Hong Kong Baptist University)
Show Abstract Download Paper

R11

Graph Structures and Queries

Chair: Shuhao Liu (Shenzhen Institute of Computing Sciences)

Approximating Probabilistic Group Steiner Trees in Graphs Shuang Yang (Renmin University of China); Yahui Sun (Renmin University of China)*; Jiesong Liu (Renmin University of China); Xiaokui Xiao (National University of Singapore); Rong-Hua Li (Beijing Institute of Technology); Zhewei Wei (Renmin University of China)
Show Abstract Download Paper

Lotan: Bridging the Gap between GNNs and Scalable Graph Analytics Engines Yuhao Zhang (University of California at San Diego)*; Arun Kumar (University of California at San Diego)
Show Abstract Download Paper

Recent advances in Graph Neural Networks (GNNs) have changed the landscape of modern graph analytics. The complexity of GNN training and the challenges of GNN scalability have also sparked interest from the systems community, with efforts to build systems that provide higher efficiency and schemes to reduce costs. However, we observe that many such systems basically "reinvent the wheel" of much work done in the database world on scalable graph analytics engines. Further, they often tightly couple the scalability treatments of graph data processing with that of GNN training, resulting in entangled complex problems and systems that often do not scale well on one of those axes. In this paper, we ask a question: How far can we push existing systems for scalable graph analytics and deep learning (DL) instead of building custom GNN systems? Are compromises inevitable on scalability and/or runtimes? We propose Lotan, the first scalable and optimized data system for full-batch GNN training with decoupled scaling that bridges the hitherto siloed worlds of graph analytics systems and DL systems. Lotan offers a series of technical innovations, including re-imagining GNN training as query plan-like dataflows, execution plan rewriting, optimized data movement between systems, a GNN-centric graph partitioning scheme, and the first known GNN model batching scheme. We prototyped Lotan on top of GraphX and PyTorch. An empirical evaluation using several real-world benchmark GNN workloads reveals a promising nuanced picture: Lotan significantly surpasses the scalability of state-of-the-art custom GNN systems, while often matching them or being only slightly behind on time-to-accuracy metrics. We also show the impact of our system optimizations. Overall, our work shows that the GNN world can indeed benefit from building on top of scalable graph analytics engines. Lotan's new level of scalability can also empower new ML-oriented research on ever-larger graphs and GNNs.

Discovering Polarization Niches via Dense Subgraphs with Attractors and Repulsers Adriano Fazzone (Sapienza University of Rome); Tommaso Lanciano (KTH Royal Institute of Technology); Riccardo Denni (Sapienza University); Charalampos Tsourakakis (Boston University); Francesco Bonchi (ISI Foundation, Turin)
Show Abstract Download Paper

gCore: Exploring Cross-layer Cohesiveness in Multi-layer Graphs Dandan Liu (Harbin Institute of Technology); Zhaonian Zou (Harbin Institute of Technology)*

Zebra: When Temporal Graph Neural Networks Meet Temporal Personalized PageRank Yiming Li (Hong Kong University of Science and Technology)*; Yanyan Shen (Shanghai Jiao Tong University); Lei Chen (Hong Kong University of Science and Technology); Mingxuan Yuan (Huawei)

R12

Reasoning, Recommendation, Classification

Chair: Besat Kassaie (University of Waterloo)

Federated Calibration and Evaluation of Binary Classifiers Graham Cormode (University of Warwick)*; Igor L Markov (Meta)

Efficient Fault Tolerance for Recommendation Model Training via Erasure Coding Tianyu Zhang (Carnegie Mellon University); Kaige Liu (Carnegie Mellon University); Jack Kosaian (Carnegie Mellon University)*; Juncheng Yang (Carnegie Mellon University); Rashmi Vinayak (Carnegie Mellon University)

Scalable Reasoning on Document Stores via Instance-Aware Query Rewriting Olivier Rodriguez (INRIA); Federico Ulliana (Inria)*; Marie-Laure Mugnier (University of Montpellier)

Happiness Maximizing Sets under Group Fairness Constraints Jiping Zheng (Nanjing University of Aeronautics and Astronautics); Yuan Ma (Nanjing University of Aeronautics and Astronautics); Wei Ma (Nanjing University of Aeronautics and Astronautics); Yanhao Wang (East China Normal University)*; Xiaoyang Wang (University of New South Wales)

Fairness Matters: A Tit-For-Tat Strategy Against Selfish Mining Weijie Sun (The Hong Kong University of Science and Technology); Zihuan Xu (Hong Kong University of Science and Technology); Lei Chen (Hong Kong University of Science and Technology)

R13

Queries and Systems I

Chair: Sabina Petride (Oracle)

C5: Cloned Concurrency Control that Always Keeps Up Jeffrey Helt (Princeton University)*; Abhinav Sharma (Meta Platforms); Daniel J Abadi (UMD); Wyatt Lloyd (Princeton University); Jose Faleiro (Microsoft)

LEON: A New Framework for ML-Aided Query Optimization Xu Chen (University of Electronic Science and Technology of China)*; Haitian Chen (University of Electronic Science and Technology of China); Zibo Liang (University of Electronic Science and Technology of China); Shuncheng Liu (University of Electronic Science and Technology of China); Jinghong Wang (Huawei Technologies Co., Ltd.); Kai Zeng (Huawei); Han Su (University of Electronic Science and Technology of China); Kai Zheng (University of Electronic Science and Technology of China)

BASE: Bridging the Gap between Cost and Latency for Query Optimization [sds] Xu Chen (University of Electronic Science and Technology of China)*; Zhen Wang (Alibaba Group); Shuncheng Liu (University of Electronic Science and Technology of China); Yaliang Li (Alibaba Group); Kai Zeng (University of Electronic Science and Technology of China); Bolin Ding (Data Analytics and Intelligence Lab, Alibaba Group); Jingren Zhou (Alibaba Group); Han Su (University of Electronic Science and Technology of China); Kai Zheng (University of Electronic Science and Technology of China)

A Randomized Blocking Structure for Streaming Record Linkage [sds] Dimitrios Karapiperis (International Hellenic University)*; Christos Tjortjis (International Hellenic University); Vassilios S. Verykios (Hellenic Open University)

A Case for Graphics-driven Query Processing Harish Doraiswamy (Microsoft Research India)*; Vikas Kalagi (Microsoft Research India); Karthik Ramachandra (Microsoft Azure SQL India); Jayant R Haritsa (Indian Institute of Science)

R14

Spatial, Spatio-Temporal

Chair: Felix Naumann (Hasso Plattner Institute)

A Hierarchical Grouping Algorithm for the Multi-Vehicle Dial-a-Ride Problem Kelin Luo (University of Bonn); Alexandre M Florio (Polytechnique Montreal); Syamantak Das (IIIT Delhi); Xiangyu Guo (University at Buffalo)*

Route Travel Time Estimation on A Road Network Revisited: Heterogeneity, Proximity, Periodicity and Dynamicity Haitao Yuan (Nanyang Technological University)*; Guoliang Li (Tsinghua University); Zhifeng Bao (RMIT University)

Automatic Road Extraction with Multi-Source Data Revisited: Completeness, Smoothness and Discrimination Haitao Yuan (Nanyang Technological University)*; Sai Wang (Wuhan University); Zhifeng Bao (RMIT University); Shangguang Wang (State Key Laboratory of Networking and Switching Technology)

Budget-Conscious Fine-Grained Configuration Optimization for Spatio-Temporal Applications Keven Richly (Hasso Plattner Institute); Rainer Schlosser (Hasso Plattner Institute); Martin Boissier (Hasso Plattner Institute)

Real-time Workload Pattern Analysis for Large-scale Cloud Databases [industry] Jiaqi Wang (Zhejiang University); Tianyi Li (Aalborg University); Anni Wang (Alibaba); Xiaoze Liu (Purdue University); Lu Chen (Zhejiang University)*; Jie Chen (Alibaba); Jianye Liu (Alibaba Group); Junyang Wu (Zhejiang University); Feifei Li (Alibaba Group); Yunjun Gao (Zhejiang University)

R15

Online Demos I

Chair: Besat Kassaie (University of Waterloo)

DoveDB: A Declarative and Low-Latency Video Database [demo] Ziyang Xiao (Zhejiang University); Dongxiang Zhang (Zhejiang University)*; Zepeng Li (Zhejiang University); Sai Wu (Zhejiang Univ); Kian-Lee Tan (National University of Singapore); Gang Chen (Zhejiang University)

FastMosaic in Action: A New Mosaic Operator for Array DBMSs [demo] Ramon Antonio Rodriges Zalipynis (HSE University)*

CORNET: Learning Spreadsheet Formatting Rules By Example [demo] Mukul Singh (Microsoft)*; José Cambronero Sánchez (Microsoft); Sumit Gulwani (Microsoft Research); Vu Le (Microsoft); Carina Negreanu (Microsoft Research); Gust Verbruggen (Microsoft)

ADOps: An Anomaly Detection Pipeline in Structured Logs [demo] Xintong Song (Netease Fuxi AI Lab)*; Yusen Zhu (NetEase Fuxi AI Lab); Jianfei Wu (Netease Fuxi AI Lab); Bai Liu (Netease Fuxi AI Lab); Hongkang Wei (Netease Fuxi AI Lab)

A Demonstration of DLBD: Database Logic Bug Detection System [demo] Xiu Tang (Zhejiang University); Sai Wu (Zhejiang Univ)*; Dongxiang Zhang (Zhejiang University); Ziyue Wang (Zhejiang University); Gongsheng Yuan (Zhejiang University); Gang Chen (Zhejiang University)

DHive: Query Execution Performance Analysis via Dataflow in Apache Hive [demo] Chaozu Zhang (Southern University of Science and Technology)*; Qiaomu Shen (Southern University of Science and Technology); Bo Tang (Southern University of Science and Technology)

Lingua Manga: A Generic Large Language Model Centric System for Data Curation [demo] Zui Chen (Tsinghua University)*; Lei Cao (University of Arizona/MIT); Samuel Madden (Massachusetts Institute of Technology)

Interactive Demonstration of EVA [demo] Gaurav Tarlok Kakkar (Georgia Institute of Technology)*; Aryan Rajoria (Georgia Institute of Technology); Myna Prasanna Kalluraya (Georgia Institute of Technology); Ashmita Raju (Georgia Institute of Technology); Jiashen Cao (Georgia Tech); Kexin Rong (Georgia Institute of Technology); Joy Arulraj (Georgia Tech)

R20

Indexing and Learned Indexing

Chair: Zheng Wang (Huawei Singapore Research Center)

PLIN: A Persistent Learned Index for Non-Volatile Memory with High Performance and Instant Recovery Zhou Zhang (USTC); Zhaole Chu (University of Science and Technology of China); Peiquan Jin (University of Science and Technology of China)*; Yongping Luo (University of Science and Technology of China); Xike Xie (University of Science and Technology of China); Shouhong Wan (University of Science and Technology of China); Yun Luo (Tencent); Xufei Wu (Tencent); Peng Zou (Tencent); Chunyang Zheng (Intel); Guoan Wu (Intel); Andy Rudoff (Intel)

Sieve: A Learned Data-Skipping Index for Data Analytics Yulai Tong (Huazhong University of Science and Technology)*; Jiazhen Liu (Huazhong University of Science and Technology); Hua Wang (Huazhong University of Science and Technology); Ke Zhou (Huazhong University of Science and Technology); Rongfeng He (Huawei Cloud Computing Technologies Co., Ltd); Qin Zhang (Huawei Cloud); Cheng Wang (Huawei Cloud Computing Technologies Co., Ltd)

LMSFC: A Novel Multidimensional Index based on Learned Monotonic Space Filling Curves Jian Gao (University of New South Wales)*; Xin Cao (University of New South Wales); Xin Yao (Huawei Theory Lab); Gong Zhang (Huawei); Wei Wang (Hong Kong University of Science and Technology (Guangzhou))

FILM: a Fully Learned Index for Larger-than-Memory Databases Chaohong Ma (Renmin University of China)*; Xiaohui Yu (York University); Yifan Li (York University); Xiaofeng Meng (Renmin University of China); Aishan Maoliniyazi (Renmin University)

Learned Index: A Comprehensive Experimental Evaluation [eab] Zhaoyan Sun (Tsinghua University); Xuanhe Zhou (Tsinghua University); Guoliang Li (Tsinghua University)*

R21

Learning and Systems

Chair: Yuxin Tang (Rice University)

CORNET: Learning Table Formatting Rules By Example Mukul Singh (Microsoft)*; José Cambronero Sánchez (Microsoft); Sumit Gulwani (Microsoft Research); Vu Le (Microsoft); Carina Negreanu (Microsoft Research); Mohammad Raza (Microsoft); Gust Verbruggen (Microsoft)

Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures Yongji Wu (Duke University)*; Matthew Lentz (Duke University); Danyang Zhuo (Duke University); Yao Lu (Microsoft Research)

LOGER: A Learned Optimizer towards Generating Efficient and Robust Query Execution Plans Tianyi Chen (Key Laboratory of High Confidence Software Technologies, CS, Peking University); Jun Gao (Peking University)*; Hedui Chen (ZTE Corporation); Yaofeng Tu (ZTE Corporation)

Falcon: A Privacy-Preserving and Interpretable Vertical Federated Learning System Yuncheng Wu (National University of Singapore); Naili Xing (National University of Singapore); Gang Chen (Zhejiang University); Tien Tuan Anh Dinh (Deakin University); Zhaojing Luo (National University of Singapore); Beng Chin Ooi (NUS)*; Xiaokui Xiao (National University of Singapore); Meihui Zhang (Beijing Institute of Technology)

Cost-Based or Learning-Based? A Hybrid Query Optimizer for Query Plan Selection Xiang Yu (Tsinghua University); Chengliang Chai (Beijing Institute of Technology); Guoliang Li (Tsinghua University); Jiabin Liu (Tsinghua University)

R22

Parallelization and Analytics

Chair: Elisa Bertino (Purdue University)

SyncSignature: A Simple, Efficient, Parallelizable Framework for Tree Similarity Joins Nikolai Karpov (Indiana University Bloomington); Qin Zhang (Indiana University Bloomington)*

Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism [sds] Xupeng Miao (Carnegie Mellon University)*; Yujie Wang (Peking University); Youhe Jiang (Peking University); Chunan Shi (Peking University); Xiaonan Nie (Peking University); Hailin Zhang (Peking University); Bin Cui (Peking University)

SDPipe: A Semi-Decentralized Framework for Heterogeneity-aware Pipeline-parallel Training [sds] Xupeng Miao (Carnegie Mellon University)*; Yining Shi (Peking University); Zhi Yang (Peking University); Bin Cui (Peking University); Zhihao Jia (Carnegie Mellon University)

FLARE: A Fast, Secure, and Memory-Efficient Distributed Analytics Framework Xiang Li (Tsinghua University)*; Fabing Li (Xi'an Jiaotong University); Mingyu Gao (Tsinghua University)

SODA: A Set of Fast Oblivious Algorithms in Distributed Secure Data Analytics Xiang Li (Tsinghua University)*; Nuozhou Sun (Tsinghua University); Yunqian Luo (Tsinghua University); Mingyu Gao (Tsinghua University)

R23

Trust, Security, Verifiability

Chair: Dimitrios Melissourgos (Grand Valley State University)

Frequency-revealing attacks against Frequency-hiding Order-preserving Encryption Xinle Cao (Zhejiang University); Jian Liu (Zhejiang University)*; Yongsheng Shen (Hang Zhou City Brain Co., Ltd); Xiaohua Ye (Hang Zhou City Brain Co., Ltd); Kui Ren (Zhejiang University)

Range Search over Encrypted Multi-Attribute Data Francesca Falzon (Brown University)*; Evangelia Anna Markatou (Brown University); Zachary T Espiritu (Brown University); Roberto Tamassia (Brown University)

R24

Queries and Systems II

Chair: Goce Trajcevski (Iowa State University)

SEIDEN: Revisiting Query Processing in Video Database Systems Jaeho Bang (Georgia Institute of Technology); Gaurav Tarlok Kakkar (Georgia Institute of Technology)*; Pramod Chunduri (Georgia Institute of Technology); Subrata Mitra (Adobe Research); Joy Arulraj (Georgia Tech)

Efficient Black-box Checking of Snapshot Isolation in Databases Kaile Huang (Nanjing University); Si Liu (ETH Zurich); Zhenge Chen (Nanjing University); Hengfeng Wei (Nanjing University)*; David A Basin (ETH Zurich); Haixiang Li (Tencent, China); Anqun Pan (Tencent, China)

Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level Pu Pang (Shanghai Jiao Tong University); Gang Deng (Alibaba Cloud); Kaihao Bai (Shanghai Jiao Tong University); Quan Chen (Shanghai Jiao Tong University)*; Shixuan Sun (Shanghai Jiao Tong University); Bo Liu (Shanghai Jiao Tong University); Yu Xu (Alibaba Cloud); Hongbo Yao (Alibaba Cloud); Zhengheng Wang (Alibaba Group); Xiyu Wang (Alibaba Group); Zheng Liu (Alibaba Group); Zhuo Song (Alibaba Cloud); Yong Yang (Alibaba Cloud); Tao Ma (Alibaba Cloud); Minyi Guo (Shanghai Jiao Tong University)

PetPS: Supporting Huge Embedding Models with Persistent Memory [sds] Minhui Xie (Tsinghua University)*; Youyou Lu (luyouyou@tsinghua.edu.cn); Qing Wang (Tsinghua University); Yangyang Feng (Tsinghua University); Jiaqiang Liu (Kuaishou); Kai Ren (Kuaishou Technology); Jiwu Shu (shujw@tsinghua.edu.cn)

MagicScaler: Uncertainty-aware, Predictive Autoscaling [industry] Zhicheng Pan (East China Normal University); Yihang Wang (Alibaba Group); Yingying Zhang (Alibaba Group); Sean Bin Yang (Aalborg University); Yunyao Cheng (Aalborg University); Peng Chen (East China Normal University); Chenjuan Guo (ECNU); Qingsong Wen (Alibaba Group U.S.); Xiduo Tian (Alibaba Group); Yunliang Dou (Alibaba Group); Zhiqiang Zhou (Alibaba Damo Academy); Chengcheng Yang (East China Normal University); Aoying Zhou (East China Normal University); Bin Yang (East China Normal University)*

R25

Search and Aggregation

Chair: Xiang Lian (Kent State University)

Fast Approximate Denial Constraint Discovery Renjie Xiao (Fudan University); Zijing Tan (Fudan University)*; Haojin Wang (Fudan University); Shuai Ma (Beihang University)

CommunityAF: An Example-based Community Search Method via Autoregressive Flow Jiazun Chen (Peking University); Yikuan Xia (Peking University); Jun Gao (Peking University)*

Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent [industry] Xiaonan Nie (Peking University)*; Yi Liu (Tencent); Fangcheng Fu (Peking University); Jinbao Xue (Tencent); Dian Jiao (Tencent); Xupeng Miao (Carnegie Mellon University); Yangyu Tao (Tencent); Bin Cui (Peking University)

HEDA: Multi-Attribute Unbounded Aggregation over Homomorphically Encrypted Database Xuanle Ren (Alibaba Group); Le Su (Alibaba Group); Zhen Gu (Alibaba Group)*; Sheng Wang (Alibaba Group); Feifei Li (Alibaba Group); Yuan Xie (Alibaba DAMO Academy); Song Bian (Kyoto University); Chao Li (Zhejiang University); Fan Zhang (Zhejiang University)

LIDER: An Efficient High-dimensional Learned Index for Large-scale Dense Passage Retrieval Yifan Wang (University of Florida)*; Haodi Ma (University of Florida); Daisy Zhe Wang (University of Florida)

R30

Learning, Recommendations, Social Networks

Chair: Naheed Anjum Arafat (Nanyang Technological University, Singapore)

Influence Maximization in Real-World Closed Social Networks Shixun Huang (University of Wollongong); Wenqing Lin (Tencent); Zhifeng Bao (RMIT University)*; Jiachen Sun (Tencent)

MultiBiSage: A Web-Scale Recommendation System Using Multiple Bipartite Graphs at Pinterest [sds] Saket Gurukar (The Ohio State University)*; Nikil Pancha (Pinterest); Andrew H Zhai (Pinterest); Eric Kim (Pinterest); Samson Hu (Pinterest); Srinivasan Parthasarathy (Ohio State University); Charles Rosenberg (Pinterest); Jure Leskovec (Stanford University)

Triangular Stability Maximization by Influence Spread over Social Networks Zheng Hu (Fudan University); Weiguo Zheng (Fudan University)*; Xiang Lian (Kent State University)

Coresets over Multiple Tables for Feature-rich and Data-efficient Machine Learning Jiayi Wang (Tsinghua University)*; Chengliang Chai (Beijing Institute of Technology); Nan Tang (Qatar Computing Research Institute, HBKU); Jiabin Liu (Tsinghua University); Guoliang Li (Tsinghua University)

Auto-Tuning with Reinforcement Learning for Permissioned Blockchain Systems Mingxuan Li (Institute of Information Engineering, Chinese Academy of Sciences)*; Yazhe Wang (State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences); Shuai Ma (Beihang University); Chao Liu (Institute of Information Engineering, Chinese Academy of Sciences); Dongdong Huo (Institute of Information Engineering, Chinese Academy of Sciences); Yu Wang (Institute of Information Engineering, Chinese Academy of Sciences); Zhen Xu (State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences)

R31

Transactions, Tuning, and Compression

Chair: Yu Xia (MIT)

Efficient Distributed Transaction Processing in Heterogeneous Networks Qian Zhang (Renmin University of China); Jingyao Li (Renmin University of China); Hongyao Zhao (Renmin University of China); Quanqing Xu (OceanBase); Wei Lu (Renmin University of China)*; Jinliang Xiao (OceanBase); Fusheng Han (OceanBase); Chuanhui Yang (OceanBase); Xiaoyong Du (Renmin University of China)

STARRY: Multi-master Transaction Processing on Semi-leader Architecture Zihao Zhang (East China Normal University); Huiqi Hu (East China Normal University)*; Xuan Zhou (East China Normal University); Jiang Wang (Huawei)

Adore: Differentially Oblivious Relational Database Operators Lianke Qin (UCSB)*; Rajesh Jayaram (Carnegie Mellon University); Elaine Shi (Carnegie Mellon University); Zhao Song (Adobe Research); Danyang Zhuo (Duke University); Shumo Chu ()

PromptEM: Prompt-tuning for Low-resource Generalized Entity Matching [sds] Pengfei Wang (Zhejiang University); Xiaocan Zeng (Zhejiang University); Lu Chen (Zhejiang University); Fan Ye (Zhejiang University); Yuren Mao (Zhejiang University); Junhao Zhu (Zhejiang University); Yunjun Gao (Zhejiang University)*

Elf: Erasing-based Lossless Floating-Point Compression Ruiyuan Li (Chongqing University)*; Zheng Li (Chongqing University); Yi Wu (Chongqing University); Chao Chen (Chongqing University); Yu Zheng (JD)

R32

Potpourri Online I (Systems & Algorithms)

Chair: Elisa Bertino (Purdue University)

TASK: An Efficient Framework for Instant Error-tolerant Spatial Keyword Queries on Road Networks Chengyang Luo (Zhejiang University); Qing Liu (Zhejiang University); Yunjun Gao (Zhejiang University)*; Lu Chen (Zhejiang University); Ziheng Wei (Huawei Technologies Co., Ltd.); Congcong Ge (Huawei Technologies Co., Ltd.)

Differentially Private Vertical Federated Clustering Zitao Li (Alibaba Group)*; Tianhao Wang (University of Virginia); Ninghui Li (Purdue University)

Frequency Domain Data Encoding in Apache IoTDB [sds] Haoyu Wang (Tsinghua University); Shaoxu Song (Tsinghua University)*

Change Propagation Without Joins Qichen Wang (Hong Kong Baptist University); Xiao Hu (University of Waterloo)*; Binyang Dai (Hong Kong University of Science and Technology); Ke Yi (Hong Kong University of Science and Technology)

BICE: Exploring Compact Search Space by Using Bipartite Matching and Cell-Wide Verification Yunyoung Choi (Alsemy); Kunsoo Park (Seoul National University); Hyunjoon Kim (Hanyang University)*

R33

Systems in Industry

Chair: Qizhen Zhang (University of Toronto)

Big Data Analytic Toolkit: A general-purpose, modular, and heterogeneous acceleration toolkit for data analytical engines [industry] Jiang Li (Intel Corporation)*; Qi Xie (Intel Corporation); Yan Ma (Intel Corporation); Jian Ma (Intel Corporation); Kunshang Ji (Intel Corporation); Yizhong Zhang (Intel Corporation); Chaojun Zhang (Intel Corporation); Yixiu Chen (Intel Corporation); Gangsheng Wu (Intel Corporation); Jie Zhang (Intel Corporation); Kaidi Yang (Intel Corporation); Xinyi He (Intel Corporation); Qiuyang Shen (Intel Corporation); Yanting Tao (Intel Corporation); Haiwei Zhao (Intel Corporation); Penghui Jiao (Intel Corporation); Chengfei Zhu (Intel Corporation); David Qian (Intel Corporation); Cheng Xu (Intel Corporation)

Towards General and Efficient Online Tuning for Spark [industry] Yang Li (Tencent)*; Huaijun Jiang (Peking University); Yu Shen (Peking University); Yide Fang (Tencent); Xiaofeng Yang (Tencent); Danqing Huang (Tencent); Xinyi Zhang (Peking University); Wentao Zhang (Peking University); Ce Zhang (ETH); Peng Chen (Tencent); Bin Cui (Peking University)

SimpleTS: An Efficient and Universal Model Selection Framework for Time Series Forecasting [industry] Yuanyuan Yao (Zhejiang University); Dimeng Li (Alibaba Group); Hailiang Jie (Zhejiang University); Lu Chen (Zhejiang University)*; Tianyi Li (Aalborg University); Jie Chen (Alibaba); Jiaqi Wang (Zhejiang University); Feifei Li (Alibaba Group); Yunjun Gao (Zhejiang University)

FEBench: A Benchmark for Real-Time Relational Data Feature Extraction [industry] Xuanhe Zhou (Tsinghua University); Cheng Chen (4Paradigm); Kunyi Li (Tsinghua University); Bingsheng He (National University of Singapore); Mian Lu (4Paradigm)*; Qiaosheng Liu (4Paradigm); Wei Huang (4Paradigm); Guoliang Li (Tsinghua University); Zhao Zheng (4Paradigm); Yuqiang Chen (4Paradigm)

R34

Potpourri Online II (Learning & Mining)

Chair: Laks V.S. Lakshmanan (University of British Columbia)

Self-Training for Label-Efficient Information Extraction from Semi-Structured Web-Pages Ritesh Sarkhel (Ohio State University)*; Binxuan Huang (Amazon); Colin Lockard (Amazon); Prashant Shiralkar (Amazon)

Self-supervised and Interpretable Data Cleaning with Sequence Generative Adversarial Networks Jinfeng Peng (Northeastern University)*; Derong Shen (Northeastern University); Nan Tang (Qatar Computing Research Institute, HBKU); Tieying Liu (Northeastern University); Yue Kou (Northeastern University); Tiezheng Nie (Northeastern University); Hang Cui (University of Illinois at Urbana-Champaign); Ge Yu (Northeastern University)

FirmTruss Community Search in Multilayer Networks Ali Behrouz (Cornell University)*; Farnoosh Hashemi (The University of British Columbia); Laks V.S. Lakshmanan (The University of British Columbia)

Efficient Triangle-Connected Truss Community Search In Dynamic Graphs Tianyang Xu (Wuhan University); Zhao Lu (Wuhan University); Yuanyuan Zhu (Wuhan University)*

R35

Online Demos II

Chair: Goce Trajcevski (Iowa State University)

Lynx: A Graph Query Framework for Multiple Heterogeneous Data Sources [demo] Zhihong Shen (Chinese Academy of Sciences, Computer Network Information Center); Hu Chuan (UCAS, CAS, CNIC)*; Zihao Zhao (CNIC, CAS, UCAS)

ChainDash: An Ad-Hoc Blockchain Data Analytics System [demo] Yushi Liu (East China Normal University); Liwei Yuan (Blockchain Platform Division, Ant Group); Zhihao Chen (East China Normal University); Yekai Yu (East China Normal University); Zhao Zhang (East China Normal University)*; Cheqing Jin (East China Normal University); Ying Yan (Ant Group)

CEDA: Learned Cardinality Estimation with Domain Adaptation [demo] Zilong Wang (Beijing Jiaotong University); Qixiong Zeng (School of Computer and Information Technology, Beijing Jiaotong University); Ning Wang (School of Computer and Information Technology, Beijing Jiaotong University)*; Haowen Lu (Beijing Jiaotong University); Yue Zhang (Beijing Jiaotong University)

Sniffer: A Novel Model Type Detection System against Machine-Learning-as-a-Service Platforms [demo] Zhuo Ma (Xidian University); Yilong Yang (Xidian University); Bin Xiao (Chongqing University of Posts and Telecommunications)*; Yang Liu (Xidian University); Xinjing Liu (Xidian University); Zhuoran Ma (Xidian University); Tong Yang (Peking University)

TsQuality: Measuring Time Series Data Quality in Apache IoTDB [demo] Yuanhui Qiu (Tsinghua University); Chenguang Fang (Tsinghua University); Shaoxu Song (Tsinghua University)*; Xiangdong Huang (Tsinghua University); Chen Wang (Timecho Limited); Jianmin Wang (Tsinghua University, China)

A Learned Query Rewrite System [demo] Xuanhe Zhou (Tsinghua University); Guoliang Li (Tsinghua University)*; Jianming Wu (Tsinghua University); Jiesi Liu (Tsinghua University); Zhaoyan Sun (Tsinghua University); Xinning Zhang (Tsinghua University)

AQUA: Automatic Collaborative Query Processing in Analytical Database [demo] Yuchen Peng (Zhejiang University); Ke Chen (Zhejiang University)*; Lidan Shou (Zhejiang University); Dawei Jiang (Zhejiang University); Gang Chen (Zhejiang University)

Fanglue: An Interactive System for Decision Rule Crafting [demo] Chen Qian (Ant Group); Shiwei Liang (Ant Group); Zhaoyang Wang (Ant Group); Yin Lou (Ant Group)*

RESCU-SQL: Oblivious Querying for the Zero Trust Cloud [demo] Xiling Li (Northwestern University)*; Gefei Tan (Northwestern University); Xiao Wang (Northwestern University); Jennie Rogers (Northwestern University); Soamar Homsi (Air Force Research Laboratory)

Scalable ML III

Chair: Aida Sheshbolouki (University of Waterloo)

Nemo: Guiding and Contextualizing Weak Supervision for Interactive Data Programming Cheng-Yu Hsieh (University of Washington); Jieyu Zhang (University of Washington); Alexander Ratner (University of Washington)

Collective Grounding: Applying Database Techniques to Grounding Templated Models Eriq Augustine (UCSC)*; Lise Getoor (University of California Santa Cruz)

On Efficient Approximate Queries over Machine Learning Models Dujian Ding (University of British Columbia)*; Sihem Amer-Yahia (CNRS); Laks V.S. Lakshmanan (The University of British Columbia)

Accelerating Aggregation Queries on Unstructured Streams of Data Matthew D Russo (Stanford University)*; Tatsunori Hashimoto (Stanford); Daniel Kang (UIUC); Yi Sun (University of Chicago); Matei Zaharia (Berkeley and Databricks)

SIFTER: Space-Efficient Value Iteration for Finite-Horizon MDPs [sds] Konstantinos Skitsas (Aarhus University); Ioannis G Papageorgiou (NTUA); Mohammad Sadegh Talebi (University of Copenhagen); Verena Kantere (NTUA); Michael Katehakis (Rutgers University); Panagiotis Karras (Aarhus University)*

Scalable ML IV

Chair: Amir Shaikhha (University of Edinburgh)

Marigold: Efficient k-means Clustering in High Dimensions [sds] Kasper Overgaard Mortensen (Aarhus University); Fatemeh Zardbani (Aarhus University); Mohammad Ahsanul Haque (AAU); Steinn Ymir Agustsson (Aarhus University); Davide Mottin (Aarhus University); Philip Hofmann (Aarhus University); Panagiotis Karras (Aarhus University)*

Fast Search-By-Classification for Large-Scale Databases Using Index-Aware Decision Trees and Random Forests Christian Lülf (University of Münster)*; Denis Mayr Lima Martins (University of Münster); Marcos Antonio Vaz Salles (Independent Researcher); Yongluan Zhou (University of Copenhagen); Fabian Gieseke (University of Münster)

Similarity search in the blink of an eye with compressed indices Cecilia Aguerrebere (Intel Labs)*; Ishwar Singh Bhati (Intel); Mark Hildebrand (Intel Corporation); Mariano Tepper (Intel Labs); Theodore L Willke (Intel Labs)

SHiFT: An Efficient, Flexible Search Engine for Transfer Learning Cedric Renggli (UZH)*; Xiaozhe Yao (ETH Zurich); Luka Kolar (ETH Zurich); Luka Rimanic (ETH Zurich); Ana Klimovic (ETH Zurich); Ce Zhang (ETH)

TOD: GPU-accelerated Outlier Detection via Tensor Operations Yue Zhao (University of Southern California)*; George H Chen (Carnegie Mellon University); Zhihao Jia (Carnegie Mellon University)

Learned Indexes & Query Processing/Optimization I

Chair: Ibrahim Sabek (University of Southern California)

DILI: A Distribution-Driven Learned Index Pengfei Li (Alibaba Group)*; Hua Lu (Roskilde University); Rong Zhu (Alibaba Group); Bolin Ding (Data Analytics and Intelligence Lab, Alibaba Group); Long Yang (Peking University); Gang Pan (Zhejiang University)

FASTgres: Making Learned Query Optimizer Hinting Effective Lucas Woltmann (Technische Universität Dresden); Jerome Thiessat (TU Dresden); Claudio Hartmann (Technische Universität Dresden); Dirk Habich (TU Dresden)*; Wolfgang Lehner (TU Dresden)

Lero: A Learning-to-Rank Query Optimizer Rong Zhu (Alibaba Group)*; Wei Chen (Alibaba); Bolin Ding (Data Analytics and Intelligence Lab, Alibaba Group); Xingguang Chen (The Chinese University of Hong Kong); Andreas Pfadler (Alibaba Group); Ziniu Wu (Massachusetts Institute of Technology); Jingren Zhou (Alibaba Group)

Learned Index Benefits: Machine Learning Based Index Performance Estimation Jiachen Shi (Nanyang Technological University); Gao Cong (Nanyang Technological University); Xiaoli Li (Institute for Infocomm Research, A*STAR, Singapore/Nanyang Technological University)

Learned Indexes & Query Processing/Optimization II

Chair: Kurt Stockinger (University of Zurich)

The Case for Learned In-Memory Joins [eab] Ibrahim Sabek (Massachusetts Institute of Technology)*; Tim Kraska (Massachusetts Institute of Technology)

ADOPT: Adaptively Optimizing Attribute Orders for Worst-Case Optimal Join Algorithms via Reinforcement Learning Junxiong Wang (Cornell University)*; Immanuel Trummer (Cornell); Ahmet Kara (University of Zurich); Dan Olteanu (University of Zurich)

Simple Adaptive Query Processing vs. Learned Query Optimizers: Observations and Analysis [eab] Yunjia Zhang (University of Wisconsin-Madison)*; Yannis Chronis (University of Wisconsin Madison); Jignesh M. Patel (Carnegie Mellon University); Theodoros Rekatsinas (ETH Zurich)

Can Learned Models Replace Hash Functions? [eab] Ibrahim Sabek (Massachusetts Institute of Technology)*; Kapil Vaidya (Massachusetts Institute of Technology); Dominik Horn (Technical University of Munich (TUM)); Andreas Kipf (Amazon Web Services); Michael Mitzenmacher (Harvard); Tim Kraska (Massachusetts Institute of Technology)

SkinnerMT: Parallelizing for Efficiency and Robustness in Adaptive Query Processing on Multicore Platforms Ziyun Wei (Cornell University)*; Immanuel Trummer (Cornell)

Tutorial-1

Private Information Retrieval in Large Scale Public Data Repositories [tutorial] Ishtiyaque Ahmad (University of California at Santa Barbara)*; Divyakant Agrawal (University of California at Santa Barbara); Amr El Abbadi (UC Santa Barbara); Trinabh Gupta (UCSB)

Tutorial-2

Databases on Modern Networks: A Decade of Research That Now Comes into Practice [tutorial] Alberto Lerner (University of Fribourg)*; Carsten Binnig (TU Darmstadt); Philippe Cudré-Mauroux (Exascale Infolab, Fribourg University); Rana Hussein (University of Fribourg); Matthias Jasny (TU Darmstadt); Theo Jepsen (USI); Dan Ports (MSR); Lasse Thostrup (TU Darmstadt); Tobias Ziegler (TU Darmstadt)

Tutorial-3

Full-Power Graph Querying: State of the Art and Challenges [tutorial] Ioana Manolescu (Inria and Institut Polytechnique de Paris); Madhulika Mohanty (Inria Saclay)*

Tutorial-4

Efficient Execution of User-Defined Functions in SQL Queries [tutorial] Alkis Simitsis (Athena Research Center)*; Yannis E Foufoulas (University of Athens)

Tutorial-5

Data and AI Model Markets: Opportunities for Data and Model Sharing, Discovery, and Integration [tutorial] Jian Pei (Simon Fraser University)*; Raul Castro Fernandez (The University of Chicago); Xiaohui Yu (York University)

Tutorial-6

Machine Learning for Subgraph Extraction: Methods, Applications and Challenges [tutorial] Kai Siong Yow (Nanyang Technological University)*; Ningyi Liao (Nanyang Technological University); Siqiang Luo (Nanyang Technological University); Reynold Cheng (The University of Hong Kong, China)

Tutorial-7

Building a Collaborative Data Analytics System: Opportunities and Challenges [tutorial] Zuozhi Wang (UC Irvine)*; Chen Li (UC Irvine)

Tutorial-9

Natural Language Interfaces for Databases with Deep Learning [tutorial] George Katsogiannis-Meimarakis (Athena Research Center)*; Mike Xydas (Athena R.C.); Georgia Koutrika (ATHENA Research Center)

Tutorial-10

Time Series Data Mining: A Unifying View [tutorial] Eamonn Keogh (UC Riverside)*

Tutorial-11

A Tutorial on Visual Representations of Relational Queries [tutorial] Wolfgang Gatterbauer (Northeastern University)*

Demo-Group-A

PSFQ: A Blockchain-based Privacy-preserving and Verifiable Student Feedback Questionnaire Platform [demo] Wangze Ni (Hong Kong University of Science and Technology); Pengze Chen (Hong Kong University of Science and Technology); Lei Chen (Hong Kong University of Science and Technology)*

Showcasing Data Management Challenges for Future IoT Applications with NebulaStream [demo] Aljoscha P Lepping (TU Berlin)*; Hoang Mi Pham (Technische Universität Berlin); Laura Mons (DIMA); Balint Rueb (TU Berlin); Ankit Chaudhary (Technische Universität Berlin); Philipp M Grulich (Technische Universität Berlin); Steffen Zeuch (Technische Universität Berlin); Volker Markl (Technische Universität Berlin)

KGNav: A Knowledge Graph Navigational Visual Query System [demo] Xiang Wang (Tianjin University); Xin Wang (Tianjin University)*; Zhaozhuo Li (Tianjin University); Dong Han (Tianjin Academy of Fine Arts)

On-the-fly Data Transformation in Action [demo] Ju Hyoung Mun (Boston University)*; Konstantinos Karatsenidis (Boston University); Tarikul Islam Papon (Boston University); Shahin Roozkhosh (Boston University); Denis Hoornaert (Technical University of Munich); Ahmed Sanaullah (Red Hat); Ulrich Drepper (Red Hat); Renato Mancuso (Boston University); Manos Athanassoulis (Boston University)

Explaining Differentially Private Query Results With DPXPlain [demo] Tingyu Wang (Duke University); Yuchao Tao (SNAP); Amir Gilad (The Hebrew University)*; Ashwin Machanavajjhala (Duke); Sudeepa Roy (Duke University, USA)

Ganos Aero: A Cloud-Native System for Big Raster Data Management and Processing [demo] Fei Xiao (Alibaba Group); Jiong Xie (Alibaba Group)*; Zhida Chen (Alibaba Group); Feifei Li (Alibaba Group); Zhen Chen (Alibaba Corp.); Jianwei Liu (Alibaba); Yinpei Liu (Alibaba Group)

Demonstration of OpenDBML, a Framework for Democratizing In-Database Machine Learning [demo] Mahdi Ghorbani (University of Edinburgh); Amir Shaikhha (University of Edinburgh)*

Demonstration of SPARQL-ML: An Interfacing Language for Supporting Graph Machine Learning for RDF Graphs [demo] Hussein Shahata Abdallah (Concordia University)*; Waleed Afandi (Concordia University); Essam Mansour (Concordia University)

Approximate Queries over Concurrent Updates [demo] Congying Wang (University at Buffalo); Nithin Sastry Tellapuri (University at Buffalo); Sphoorthi Keshannagari (University at Buffalo); Dylan Zinsley (University at Buffalo); Zhuoyue Zhao (University at Buffalo)*; Dong Xie (Penn State University)

DuckPGQ: Bringing SQL/PGQ to DuckDB [demo] Daniel ten Wolde (Centrum Wiskunde & Informatica)*; Gábor Szárnyas (CWI); Peter Boncz (Centrum Wiskunde & Informatica)

Demo of QueryBooster: Supporting Middleware-based SQL Query Rewriting as a Service [demo] Qiushi Bai (UC Irvine)*; Sadeem Alsudais (UC Irvine); Chen Li (UC Irvine)

Portals: A Showcase of Multi-Dataflow Stateful Serverless [demo] Jonas Spenger (KTH Royal Institute of Technology)*; Chengyang Huang (KTH Royal Institute of Technology); Philipp Haller (KTH Royal Institute of Technology); Paris Carbone (KTH Royal Institute of Technology)

XDB in Action: Decentralized Cross-Database Query Processing for Black-Box DBMSes [demo] Haralampos Gavriilidis (Technische Universität Berlin)*; Leonhard Rose (Technische Universität Berlin); Joel Ziegler (Technische Universität Berlin); Kaustubh Beedkar (IIT Delhi); Jorge-Arnulfo Quiané-Ruiz (IT University of Copenhagen); Volker Markl (Technische Universität Berlin)

Demo-Group-B

QO-Insight: Inspecting Steered Query Optimizers [demo] Christoph Anneser (Technical University of Munich)*; Mario Petruccelli (TUM); Nesime Tatbul (Intel Labs and MIT); David E Cohen (Intel); Zhenggang Xu (Meta Platforms); Prithviraj P Pandian (Meta); Nikolay Laptev (Facebook); Ryan C Marcus (Massachusetts Institute of Technology); Alfons Kemper (TUM)

Demonstrating Waffle: A Self-driving Grid Index [demo] Dalsu Choi (Korea University); Hyunsik Yoon (Korea University); Hyubjin Lee (Korea University); Yon Dohn Chung (Korea University)*

CM-Explorer: Dissecting Data Ingestion Problems [demo] Niels Bylois (Hasselt University)*; Frank Neven (Hasselt University); Stijn Vansummeren (Hasselt University)

Solving Hard Variants of Database Schema Matching on Quantum Computers [demo] Kristin Fritsch (University of Passau); Stefanie Scherzinger (University of Passau)*

To UDFs and Beyond: Demonstration of a Fully Decomposed Data Processor for General Data Wrangling Tasks [demo] Nico Schäfer (RPTU Kaiserslautern-Landau); Damjan Gjurovski (RPTU Kaiserslautern-Landau)*; Angjela Davitkova (RPTU Kaiserslautern-Landau); Sebastian Michel (RPTU Kaiserslautern-Landau)

BrewER: Entity Resolution On-Demand [demo] Luca Zecchini (Università degli Studi di Modena e Reggio Emilia)*; Giovanni Simonini (University of Modena and Reggio Emilia); Sonia Bergamaschi (Università di Modena e Reggio Emilia); Felix Naumann (Hasso Plattner Institute, University of Potsdam)

Web Connector: A Unified API Wrapper to Simplify Web Data Collection [demo] Weiyuan Wu (Simon Fraser University)*; Pei Wang (Simon Fraser University); Yi Xie (Simon Fraser University); Yejia Liu (Simon Fraser University); George Chow (Simon Fraser University); Jiannan Wang (Simon Fraser University)

ERICA: Query Refinement for Diversity Constraint Satisfaction [demo] Jinyang Li (University of Michigan)*; Alon Silberstein (Ben Gurion University); Yuval Moskovitch (Ben Gurion University); Julia Stoyanovich (New York University); H. V. Jagadish (University of Michigan)

DataRinse: Semantic Transforms for Data preparation based on Code Mining [demo] Ibrahim Abdelaziz (IBM Research); Julian Dolby (IBM Research); Udayan Khurana (IBM Research); Horst Samulowitz (IBM Research); Kavitha Srinivas (IBM Research)*

Demonstrating ADOPT: Adaptively Optimizing Attribute Orders for Worst-Case Optimal Joins via Reinforcement Learning [demo] Junxiong Wang (Cornell University)*; Mitchell E Gray (Cornell); Immanuel Trummer (Cornell); Ahmet Kara (University of Zurich); Dan Olteanu (University of Zurich)

Demonstrating GPT-DB: Generating Query-Specific and Customizable Code for SQL Processing with GPT-4 [demo] Immanuel Trummer (Cornell University)*

PikePlace: Generating Intelligence for Marketplace Datasets [demo] Shi Qiao (SmartApps); Alekh Jindal (SmartApps)*

Demo-Group-C

PAINE Demo: Optimizing Video Selection Queries With Commonsense Knowledge [demo] Wenjia He (University of Michigan)*; Ibrahim Sabek (Massachusetts Institute of Technology); Yuze Lou (University of Michigan); Michael Cafarella (MIT CSAIL)

DeepVQL: Deep Video Queries on PostgreSQL [demo] Dong June Lew (Kunsan National University); Kihyun Yoo (Kunsan National University); Kwang Woo Nam (Kunsan National University, South Korea)*

EQUI-VOCAL Demonstration: Synthesizing Video Queries from User Interactions [demo] Enhao Zhang (University of Washington)*; Maureen Daum (University of Washington); Dong He (University of Washington); Manasi Ganti (University of Washington Seattle); Brandon Haynes (Microsoft Gray Systems Lab); Ranjay Krishna (University of Washington); Magdalena Balazinska (UW)

Interpretable Clustering of Multivariate Time Series with Time2Feat [demo] Angela Bonifati (University of Lyon); Francesco Del Buono (University of Modena e Reggio Emilia); Francesco Guerra (University of Modena e Reggio Emilia)*; Miki Lombardi (Adobe); Donato Tiano (Università degli Studi di Modena e Reggio Emilia)

mlwhatif: What If You Could Stop Re-Implementing Your Machine Learning Pipeline Analyses Over and Over? [demo] Stefan Grafberger (University of Amsterdam)*; Shubha Guha (University of Amsterdam); Paul Groth (University of Amsterdam); Sebastian Schelter (University of Amsterdam)

VisualNeo: Bridging the Gap between Visual Query Interfaces and Graph Query Engines [demo] Kai Huang (HKUST)*; Houdong Liang (Hong Kong University of Science and Technology); Chongchong Yao (Hong Kong University of Science and Technology); Xi Zhao (The Hong Kong University of Science and Technology); Yue Cui (The Hong Kong University of Science and Technology); Yao Tian (The Hong Kong University of Science and Technology); Ruiyuan Zhang (The Hong Kong University of Science and Technology); Xiaofang Zhou (Hong Kong University of Sci and Tech)

KG-Roar: Interactive Datalog-based Reasoning on Virtual Knowledge Graphs [demo] Luigi Bellomarini (Banca d'Italia)*; Marco Benedetti (Banca d'Italia); Andrea Gentili (Banca d'Italia); Davide Magnanimi (Politecnico di Milano); Emanuel Sallinger (TU Wien)

Visualizing Spreadsheet Formula Graphs Compactly [demo] Fanchao Chen (Fudan University); Dixin Tang (University of California at Berkeley)*; Haotian Li (The Hong Kong University of Science and Technology); Aditya G. Parameswaran (University of California at Berkeley)

FS-Real: A Real-World Cross-Device Federated Learning Platform [demo] Dawei Gao (Alibaba-inc); Daoyuan Chen (Alibaba Group)*; Zitao Li (Alibaba Group); Yuexiang Xie (Alibaba Group); Xuchen Pan (Alibaba Group); Yaliang Li (Alibaba Group); Bolin Ding (Data Analytics and Intelligence Lab, Alibaba Group); Jingren Zhou (Alibaba Group)

Odyssey: An Engine Enabling The Time-Series Clustering Journey [demo] John Paparrizos (The Ohio State University)*; Sai Prasanna Teja Reddy (Exelon Utilities)

SHEVA: A Visual Analytics System for Statistical Hypothesis Exploration [demo] Vicente N de Almeida (UFRGS)*; Eduardo Ribeiro (Universidade Federal do Tocantins); Nassim Bouarour (CNRS, University Grenoble Alpes); Joao Luiz Dihl Comba (UFRGS); Sihem Amer-Yahia (CNRS)

Join Order Selection with Deep Reinforcement Learning: Fundamentals, Techniques, and Challenges [tutorial] Zhengtong Yan (University of Helsinki)*; Valter Uotila (University of Helsinki); Jiaheng Lu (University of Helsinki)