SpaceSearch: A Library for Building and Verifying Solver-Aided Tools

by Konstantin Weitz, Steven Lyubomirsky, Stefan Heule, Emina Torlak, Michael D. Ernst, and Zachary Tatlock

22nd ACM SIGPLAN International Conference on Functional Programming
September 3-9, 2017, Oxford, United Kingdom

Abstract

Many verification tools build on automated solvers. These tools reduce problems in a specific application domain (e.g., compiler optimization validation) to queries that can be discharged with a highly optimized solver. But the correctness of the reductions themselves is rarely verified in practice, limiting the confidence that the solver's output establishes the desired domain-level property.
This paper presents SpaceSearch, a new library for developing solver-aided tools within a proof assistant. A user builds their solver-aided tool in Coq against the SpaceSearch interface, and the user then verifies that the results provided by the interface are sufficient to establish the tool's desired high-level properties. Once verified, the tool can be extracted to an implementation in a solver-aided language (e.g., Rosette), where SpaceSearch provides an efficient instantiation of the SpaceSearch interface with calls to an underlying SMT solver. This combines the strong correctness guarantees of developing a tool in a proof assistant with the high performance of modern SMT solvers. This paper also introduces new optimizations for such verified solver-aided tools, including parallelization and incrementalization.
We evaluate SpaceSearch by building and verifying two solver-aided tools. The first, SaltShaker, checks that RockSalt's x86 semantics for a given instruction agrees with STOKE's x86 semantics. SaltShaker identified 7 bugs in RockSalt and 1 bug in STOKE. After these systems were patched by their developers, SaltShaker verified the semantics' agreement on 15,255 instruction instantiations in under 2 hours. The second tool, BGProof, is a verified version of an existing Border Gateway Protocol (BGP) router configuration checker. Like the existing checker, BGProof scales to checking industrial configurations spanning over 240 KLOC, identifying 19 configuration inconsistencies with no false positives. However, the correctness of BGProof has been formally proven, and we found 2 bugs in the unverified implementation. These results demonstrate that SpaceSearch is a practical approach to developing efficient, verified solver-aided tools.
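
To make the idea of programming against a search-space interface more concrete, the following is a minimal sketch in Python, not the library's Coq interface. It assumes a hypothetical `Space` type with operations named `empty`, `single`, `union`, `bind`, and `search`, and it instantiates them by brute-force enumeration where the extracted tool described above would call an SMT solver. The two 4-bit "instruction semantics" (`spec_neg`, `impl_neg`, `buggy_neg`) are invented for illustration and only mimic the kind of agreement check that SaltShaker performs.

```python
from itertools import chain
from typing import Callable, Iterator


class Space:
    """A finite search space, stored as a factory that yields its elements.

    Hypothetical interface for illustration; this instantiation enumerates
    elements one by one instead of querying a solver.
    """

    def __init__(self, items: Callable[[], Iterator]):
        self._items = items

    @staticmethod
    def empty() -> "Space":
        # The space with no elements.
        return Space(lambda: iter(()))

    @staticmethod
    def single(x) -> "Space":
        # The space containing exactly x.
        return Space(lambda: iter((x,)))

    def union(self, other: "Space") -> "Space":
        # All elements of self followed by all elements of other.
        return Space(lambda: chain(self._items(), other._items()))

    def bind(self, f: Callable[[object], "Space"]) -> "Space":
        # For every element x of self, include every element of f(x).
        return Space(lambda: (y for x in self._items() for y in f(x)._items()))

    def search(self):
        # Return ("solution", x) for the first element found,
        # or ("uninhabited", None) if the space has no elements.
        for x in self._items():
            return ("solution", x)
        return ("uninhabited", None)


def bitvectors(width: int) -> Space:
    """All unsigned integers representable in `width` bits."""
    return Space(lambda: iter(range(2 ** width)))


def counterexample(sem_a, sem_b, width: int = 4):
    """Search for an input on which two semantics disagree."""
    disagreements = bitvectors(width).bind(
        lambda x: Space.single(x) if sem_a(x) != sem_b(x) else Space.empty()
    )
    return disagreements.search()


# Hypothetical 4-bit semantics of a "negate" instruction.
def spec_neg(x: int) -> int:
    return (-x) & 0xF


def impl_neg(x: int) -> int:
    return ((~x) + 1) & 0xF  # two's complement negation


def buggy_neg(x: int) -> int:
    return (~x) & 0xF  # forgets the "+ 1"


print(counterexample(spec_neg, impl_neg))   # ("uninhabited", None)
print(counterexample(spec_neg, buggy_neg))  # ("solution", 0)
```

In this sketch, an "uninhabited" result means that no disagreeing input exists within the enumerated space, so the two semantics agree on all 4-bit inputs. The point of verifying a tool against such an interface is that this conclusion follows from a proof about the reduction and does not rest on trust in it.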