
PhD Candidate: Mudathir Mohamed
Abstract
Problems from many industrial applications can be encoded naturally in the languages of finite sets and multisets. Concrete examples include software design specifications, ontologies, relational algebra, database queries, normative requirements, and authorization policies. To reason about these languages, researchers have developed several specialized solvers for the quantifier-free first-order formulas for sets, relations, and multisets and implemented them in SMT solvers.
Solver based on Satisfiability Modulo Theories (SMT) proven to be particularly effective at reasoning about these kinds of problems.
However, some applications require more expressive power than the ones offered by these solvers.
One class of examples is represented by modeling languages with set-bounded quantifiers, that is, quantified variables ranging over finite sets.
Constraints written in these languages have proven hard to solve using general quantification techniques in SMT solvers.
Another class of examples is languages based on relational algebra such as Alloy and SQL which generate constraints containing second-order operators, such as filter and map, not supported by existing solvers.
This thesis proposes a general approach that extends existing relational solvers in SMT with a number of first-order and second-order operators to handle encodings of these languages efficiently.
For the first class, it proposes extending a quantifier-free fragment of relations with a filter operator and using it to encode set-bounded quantifiers. The thesis discusses an extension of a relational SMT solver with support for the filter operator. Experimental results show that this extension outperforms other quantification techniques in satisfiable problems generated by SLEEC, a tool based on relational logic for modeling normative behavior, such as requirements in healthcare.
The extension is also very competitive on unsatisfiable problems when compared to LEGOS, a specialized solver for SLEEC. The thesis additionally identifies a class of constraints with restricted applications of the filter operator whose satisfiability is decidable. It also shows that such restrictions are necessary because unrestricted applications make the satisfiability problem undecidable.
For the second class of languages, this thesis adds further operators such as map and projection and uses them to encode SQL queries based on finite set semantics.
To support queries with multisets semantics, the thesis introduces a theory of tables with filter and map operators as an extension to a theory of finite multisets. Since SQL supports null values in tables, the thesis introduces a theory of nullable sorts based on an extension of the theory of algebraic datatypes.
The thesis integrates novel specialized solvers for the theories above into cvc5, an SMT solver that already supports reasoning in a large number of theories including the theories of arithmetic, datatypes and strings. This extended combination of theories enables a uniform encoding of several classes of SQL verification problems such as query emptiness, containment, and equivalence.
Experimental results show this approach is competitive in proving equivalence of two queries in a fragment of SQL language based on set semantics.
It provides counterexamples for inequivalent queries based on either set or bag semantics.
Advisor: Cesare Tinelli