Efficient Query Processing Under Updates

Modern computing tasks such as real-time analytics require constant refresh of query results under high update rates. Incremental View Maintenance (IVM) approaches this problem by materializing results in order to avoid recomputations. IVM naturally induces a trade-off between the space needed to maintain the materialized results and the time used to process updates. In this talk we will discuss a new approach for evaluating queries under updates. Instead of the materialization of results, we aim to maintain a data structure featuring (1) efficient maintenance under updates, (2) constant-delay enumeration of the output, (3) constant-time lookups in the output, while (4) using only linear space in the size of the database. We will describe DYN, a dynamic version of the Yannakakis algorithm, and show that it yields such data structures for the class of free-connex acyclic marginalize-join queries. We show that this is optimal in the sense that such a data structure cannot exist for marginalize-join queries that are not free-connex acyclic. In addition, we identify a sub-class of queries for which DYN features constant-time update per tuple. We introduce a cost model for DYN under different join trees, and describe how to evaluate under updates the more general class of acyclic aggregate join queries. Finally, using the industry-standard benchmarks TPC-H and TPC-DS, we experimentally compare DYN and a higher-order IVM (HIVM) engine.