Title: A Tale of Two Join Algorithms
Abstract:
In this talk we will talk about two new database join algorithms with provable optimality guarantees.
We first present a recent algorithm (PODS 2012) that is the first provably optimal (worst-case) algorithm to compute database joins. As a special case, this join algorithm implies (i) The first algorithmic versions of some well-known geometric inequalities due to Loomis and Whitney (and their generalizations by Bollobas and Thomason); (ii) Algorithmically list recoverable codes that work with parameters that no known algorithmic list recovery result work with (e.g. those based on the Reed-Solomon codes) and an application of this result in designing sublinear time decodable compressed sensing schemes; (ii) Worst-case optimal algorithm to list all occurrences of any fixed hypergraph H in a given large hypergraph G.
The second algorithm (PODS 2014) has stronger optimality guarantees: we present an adaptive join algorithm whose run time depends on the "difficulty" of the data. We believe that this algorithm has more practical applications since worst-case optimal algorithms might have terrible performance on "real data." As a special case, we present an (almost) instance optimal algorithm (with respect to comparison based algorithms) for a large class of join queries (namely Fagin's \beta-acyclic queries).
The talk will be self-contained and is based on join(t) works with Ngo, Nguyen, Porat and Re.