4 November 2022, 10:30-11:30.
Datalog: Optimizations and Extensions
Modern data analytics requires iteration, yet relational database engines are mostly optimized for non-recursive queries. SQL supports only a limited form of recursion. A better formalism for recursive queries is datalog, which has some elegant properties (recursion always terminates), and led to the development of two powerful optimizations techniques: semi-naive evaluation, and magic set rewriting. But standard datalog is restricted to monotone queries over sets and does not support aggregates, which has limited its adoption.
In this talk I will describe a new approach to recursive queries and their optimization. First, we extend datalog to semirings, while preserving some of the elegant properties of datalog, and also supporting aggregates naturally. Then, I will describe a simple, yet very powerful optimization rule, called the FGH rule, that rewrites a recursive program into a different recursive program. The rule captures many optimizations discussed in the literature, such as magic set optimization, the PreM rule, and semi-naive evaluation, and as well as new semantics optimizations. Our implementation of the FGH rule is based on the egg term rewriting engine, and the Rosette program synthesizer.
Joint work with: Yisu Remy Wang, Mahmoud Abo Khamis, Hung Q. Ngo, Reinhard Pichler