Provenance Circuits for Trees and Treelike Instances

Query evaluation in monadic second-order logic is known to be tractable on trees and treelike instances, even though it is hard for arbitrary instances. This tractability result has been extended to several tasks related to query evaluation, such as counting query results and query evaluation on probabilistic trees. We see counting and probability evaluation as two examples of the more general problem of computing augmented query output, that is referred to as provenance. In this talk I will present a provenance framework for tree and treelike instances, by describing a linear-time construction of a circuit provenance representation for MSO queries. We show how this provenance can be connected to the usual definitions of semiring provenance on relational instances, even though we compute it in an unusual way, using tree automata; we do so via intrinsic definitions of provenance for general semirings, that are independent from the operational details of query evaluation. We show the applications of this provenance to capture existing counting and probabilistic results on trees and treelike instances, and give novel consequences for probability evaluation.