# Research

### Overall Objectives

The VeriDis project team includes members of the MOSEL group at LORIA, the computer science laboratory in Nancy, and members of the research group Automation of Logic at Max-Planck-Institut für Informatik in Saarbrücken. It is headed by Stephan Merz and Christoph Weidenbach. VeriDis was created in 2010 as a local research group of Inria Nancy – Grand Est and has been an Inria project team since July 2012.

The objectives of VeriDis are to contribute to advances in verification techniques, including automated and interactive theorem proving, and to make them available for the formal development and analysis of concurrent and distributed algorithms and systems, within the framework of mathematically precise and practically applicable development methods. We intend to assist designers of algorithms and systems in carrying out formally proved developments, where proofs of relevant properties, as well as bugs, can be found with a high degree of automation.

Verification techniques based on theorem proving are already having substantial impact. In particular, they have been successfully applied to the verification and analysis of sequential programs, often in combination with static analysis and software model checking. Ideally, systems and their properties would be specified in high-level, expressive languages, errors in specifications would be discovered automatically, and finally, full verification could also be performed completely automatically. Due to the inherent complexity of the problem, this cannot be achieved in general. We have, however, observed significant advances in theorem proving in recent years. We are particularly interested in the integration of different deduction techniques and tools, including automated theorem proving for relevant theories such as different fragments of arithmetic. These advances suggest that a substantially higher degree of automation can be achieved in system verification than what is available in today’s verification tools.

VeriDis aims at exploiting and further developing automation in system verification, and at applying its techniques within the context of concurrent and distributed algorithms, which are by now ubiquitous and whose verification is a big challenge. Concurrency problems are central for the development and verification of programs for multi- and many-core architectures, and distributed computation underlies the paradigms of grid and cloud computing. The potential of distributed systems for increased resilience to component failures makes them attractive in many contexts, but also makes formal verification important and challenging. We aim at moving current research in this area to a new level of productivity and quality. To give a concrete example: today the designer of a new distributed protocol may validate it using testing or model checking. Model checking will help find bugs, but can only guarantee properties of a high-level model of the protocol, usually restricted to finite instances. Testing distributed systems and protocols is notoriously difficult because corner cases are hard to establish and reproduce. Also, many testing techniques require an executable, whose production is expensive and time-consuming, and since an implementation is needed, errors are found only when they are expensive to fix. The techniques that we develop aim at automatically proving significant properties of the protocol as early as during the design phase. Our methods mainly target designs and algorithms at high levels of abstraction; the intended applications range from components of operating systems and distributed services down to the (mobile) network systems industry.

## Results

#### Extension of the Superposition Calculus with $\lambda$-free Higher-Order Terms and (Co)datatypes

Joint work with Alexander Bentkamp (VU Amsterdam), Simon Cruanes (Aesthetic Integration), Nicolas Peltier (IMAG Grenoble), and Simon Robillard (Chalmers Gothenburg).

Superposition is a highly successful calculus for reasoning about first-order logic with equality. As a stepping stone towards extending the calculus to full higher-order logic, Bentkamp et al. [article] designed a graceful generalization of the calculus to a fragment devoid of $\lambda$-abstractions, but with partial application and application of variables, two crucial higher-order features. This builds on earlier work on term orders, namely the recursive path order [article] and the Knuth-Bendix order [article]. We implemented the calculi in Simon Cruanes’s Zipperposition prover and evaluated them on TPTP benchmarks. The performance is substantially better than with the traditional, encoding-based approach. The new superposition-like calculus thus brings us closer to complete, efficient automatic theorem provers for full higher-order logic.

Another extension of superposition, by Blanchette et al. [article], concerns the native support for inductive and coinductive datatypes. The ability to reason about datatypes has many applications in program verification, formalization of the metatheory of programming languages, and even formalization of mathematics.

Both lines of work aim at bridging the gap between automatic and interactive theorem provers, by increasing the expressiveness and efficiency of best-of-breed automatic first-order provers based on the superposition calculus.

#### IsaFoL: Isabelle Formalization of Logic

Joint work with Alexander Bentkamp (VU Amsterdam), Andreas Halkjær From (DTU Copenhagen), Alexander Birch Jensen (DTU Copenhagen), Peter Lammich (TU München), John Bruntse Larsen (DTU Copenhagen), Julius Michaelis (TU München), Tobias Nipkow (TU München), Nicolas Peltier (IMAG Grenoble), Simon Robillard (Chalmers Gothenburg), Anders Schlichtkrull (DTU Copenhagen), Dmitriy Traytel (ETH Zürich), Jørgen Villadsen (DTU Copenhagen), and Petar Vukmirović (VU Amsterdam).

Researchers in automated reasoning spend a significant portion of their work time specifying logical calculi and proving metatheorems about them. These proofs are typically carried out with pen and paper, which is error-prone and can be tedious. As proof assistants are becoming easier to use, it makes sense to employ them.

In this spirit, we started an effort, called IsaFoL (Isabelle Formalization of Logic), that aims at developing libraries and methodology for formalizing modern research in the field, using the Isabelle/HOL proof assistant (https://bitbucket.org/isafol/isafol/wiki/Home). Our initial emphasis is on established results about propositional and first-order logic.

The main result this year has been a formalization of a large part of Bachmair and Ganzinger’s chapter on resolution theorem proving in the Handbook of Automated Reasoning, by Anders Schlichtkrull et al. The work was conducted by Schlichtkrull largely during a visit at the MPI in Saarbrücken and was published at IJCAR 2018 [article]. The following quote of one of the reviews nicely sums up the objective of the project:

The authors convinced me that their development is a great tool for exploring/developing calculus extensions. It will enable us to “extend/hack without fear.”

A follow-up paper [article], also by Schlichtkrull et al., has been accepted at CPP 2019. In this work, a chain of refinement leads to a verified executable prover.

The IsaFoL repository has welcomed several further additions in 2018, and there is largely finished work, which we expect will lead to at least two publications in 2019:

• After the journal publication [article] following up on an IJCAR 2016 paper and a publication at CPP 2018 [article], Fleury has improved his verified SAT solver IsaSAT further by implementing four optimizations: restarts, forgetting, blocking literals, and machine integers. IsaSAT is now by far the most efficient verified SAT solver, and it is catching up with MiniSat, a reference (but unverified) SAT solver implementation.

• Sophie Tourret and Simon Robillard have formalized a new framework, designed primarily by Uwe Waldmann, that captures abstractly the lifting from completeness of a calculus for propositional logic to a first-order prover. This will yield a simpler proof of Bachmair and Ganzinger’s completeness theorem and will be reusable for reasoning about other provers (e.g., superposition provers), whether with pen and paper or in Isabelle.

Jasmin Blanchette briefly describes this ongoing research in an invited paper [article], which he will present at CPP 2019.

#### Subtropical Reasoning for Real Inequalities

Joint work with Hoon Hong (North Carolina State University, Raleigh, NC).

We consider systems of strict multivariate polynomial inequalities over the reals. All polynomial coefficients are parameters ranging over the reals, where for each coefficient we prescribe its sign. We are interested in the existence of positive real solutions of our system for all choices of coefficients subject to our sign conditions. We give a decision procedure for the existence of such solutions. In the positive case our procedure yields a parametric positive solution as a rational function in the coefficients. Our framework allows heuristic subtropical approaches to be reformulated for non-parametric systems of polynomial inequalities. Such systems have been recently used in qualitative biological network analysis and, independently, in satisfiability modulo theory solving. We apply our results to characterize the incompleteness of those methods.

The approach allows SMT solving for non-linear real arithmetic to be heuristically reduced to linear real arithmetic, to which, e.g., methods from section 60 are applicable. In the special case of single inequalities one can even reduce to linear programming [article]. This has been successfully applied to heuristic search for Hopf bifurcation fixed points in chemical and biological network analysis.
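To make the subtropical idea for a single inequality $f(x) > 0$ concrete, here is a minimal sketch. It relies on the standard observation that if some direction $n$ makes a monomial with positive coefficient the unique maximizer of $n \cdot e$ over all exponent vectors $e$ of $f$, then $f$ is positive at the point $(a^{n_1}, \dots, a^{n_d})$ for sufficiently large $a$. The function names, the brute-force search over a small integer box (standing in for the linear programming step), and the doubling strategy are illustrative assumptions, not the implementation used in the work described above.

```python
from itertools import product

# A polynomial is a dict mapping exponent tuples to coefficients,
# e.g. {(2, 1): 1} stands for x^2 * y. (Representation chosen for this sketch.)

def evaluate(poly, point):
    """Evaluate the polynomial at the given point."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for x, e in zip(point, exps):
            term *= x ** e
        total += term
    return total

def find_direction(poly, bound=3):
    """Search a small integer box for a vector n such that a monomial with
    positive coefficient is the unique maximizer of n . e over all exponents.
    An LP solver would normally do this; brute force keeps the sketch short."""
    dim = len(next(iter(poly)))
    for n in product(range(-bound, bound + 1), repeat=dim):
        scores = {e: sum(ni * ei for ni, ei in zip(n, e)) for e in poly}
        best = max(scores.values())
        winners = [e for e, s in scores.items() if s == best]
        if len(winners) == 1 and poly[winners[0]] > 0:
            return n
    return None

def subtropical_point(poly, bound=3, max_doublings=60):
    """Return a positive point where poly > 0, or None if the heuristic fails.
    None says nothing about satisfiability: the method is incomplete."""
    n = find_direction(poly, bound)
    if n is None:
        return None
    base = 2
    for _ in range(max_doublings):
        point = tuple(base ** ni for ni in n)
        if evaluate(poly, point) > 0:
            return point
        base *= 2
    return None

# f(x, y) = x^2*y - x^3 - y^2 - 5
f = {(2, 1): 1, (3, 0): -1, (0, 2): -1, (0, 0): -5}
# g(x, y) = x*y - x^2 - y^2 has no positive point, so no direction exists
g = {(1, 1): 1, (2, 0): -1, (0, 2): -1}
```

For `f`, the search finds a direction and then a positive point at which `f` is positive. For `g`, the heuristic correctly gives up, because the exponent of the only positive monomial lies between the exponents of the two negative ones.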

#### Reasoning in Linear Arithmetic

We have continued our work on reasoning in linear integer (LIA), linear real (LRA), and linear mixed arithmetic (LIRA). Whereas the standard branch-and-bound techniques [article] for LIA typically work well for bounded systems of inequations, they often diverge on unbounded systems. We previously proposed cube techniques for this case. They comprise efficiently computable sufficient tests for the existence of a solution [article]. However, these tests are also necessary for the existence of a solution only in the case of a system that is unbounded in all directions. For partially unbounded systems, our combination of the Mixed-Echelon-Hermite transformation and the Double-Bounded Reduction for systems of linear mixed arithmetic preserves satisfiability, can be computed in polynomial time, and turns any LIRA system into a bounded system [article]. After the application of our two transformations, existing approaches for LIRA, e.g., branch-and-bound and cuts from proofs, only explore a finite search space. The transformations are guided by the structure of an input system instead of computing a priori (over-)approximations out of the available constants. We also developed a polynomial method for converting certificates of (un)satisfiability from the transformed to the original system.
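The following sketch illustrates why a cube test is a sufficient test for integer solutions of a system $Ax \le b$. If a point $z$ satisfies the tightened constraints $a_i \cdot z \le b_i - \lVert a_i \rVert_1 / 2$, then the axis-parallel cube of edge length 1 centered at $z$ lies inside the polyhedron, and rounding $z$ to a nearest integer point yields a solution. The function name and the assumption that the candidate center is supplied by the caller (in practice it would come from an LRA solver run on the tightened system) are illustrative choices, not the interface of any particular solver.

```python
from fractions import Fraction
from math import floor

def unit_cube_test(A, b, center):
    """Given a candidate cube center, return an integer solution of A x <= b
    if the unit cube around `center` fits inside the polyhedron, else None.
    None does not mean the system has no integer solution."""
    for row, bound in zip(A, b):
        lhs = sum(a * z for a, z in zip(row, center))
        # Half the 1-norm of the row is how far a row value can grow
        # when moving from the center to any point of the unit cube.
        margin = Fraction(sum(abs(a) for a in row), 2)
        if lhs > bound - margin:
            return None
    # Round each coordinate to a nearest integer (distance at most 1/2).
    return tuple(floor(z + Fraction(1, 2)) for z in center)

# Example system: x + 2y <= 10, x >= 0, y >= 0
A = [[1, 2], [-1, 0], [0, -1]]
b = [10, 0, 0]
good_center = (Fraction(23, 10), Fraction(17, 10))
bad_center = (Fraction(1, 5), Fraction(1, 5))
```

With `good_center` the test succeeds and returns an integer point that satisfies all three inequalities. With `bad_center` the cube sticks out of the polyhedron, so the test returns `None` even though the system has integer solutions.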

Meanwhile, our techniques have been integrated into the SMT solver veriT, as well as into other SMT solvers such as Z3 [article] and MathSAT [article]. They contributed substantially to our success at SMT-COMP 2018.

#### Combination of Satisfiability Procedures

Joint work with Christophe Ringeissen (Inria Nancy – Grand Est, Pesto) and Paula Chocron (IIIA-CSIC, Bellaterra, Spain).

A satisfiability problem is often expressed in a combination of theories, and a natural approach consists in solving the problem by combining the satisfiability procedures available for the component theories. This is the purpose of the combination method introduced by Nelson and Oppen. However, in its initial presentation, the Nelson-Oppen combination method requires the theories to be signature-disjoint and stably infinite. The design of a generic combination method for non-disjoint unions of theories is difficult, but it is worth exploring simple non-disjoint combinations that appear frequently in verification. An example is the case of shared sets, where sets are represented by unary predicates. Another example is the case of bridging functions between data structures and a target theory (e.g., a fragment of arithmetic).

In 2015, we defined a sound and complete combination procedure à la Nelson-Oppen for the theory of absolutely free data structures (including lists and trees) connected to another theory via bridging functions [article]. This combination procedure has also been refined for standard interpretations. The resulting theory has a nice politeness property, enabling combinations with arbitrary decidable theories of elements. We also investigated other theories [article] amenable to similar combinations: this class includes the theory of equality, the theory of absolutely free data structures, and all the theories in between.

In 2018, we improved the framework and unified both results. A paper is under review.

#### Quantifier Handling in SMT

Joint work with Andrew J. Reynolds (Univ. of Iowa, USA) and Cezary Kaliszyk (Univ. of Innsbruck).

SMT solvers generally rely on various instantiation techniques for handling quantifiers. We built a unifying framework encompassing quantified formulas with equality and uninterpreted functions, such that the major instantiation techniques in SMT solving can be cast in that framework. It is based on the problem of E-ground (dis)unification, a variation of the classic Rigid E-unification problem. We introduced a sound and complete calculus to solve this problem in practice: Congruence Closure with Free Variables (CCFV). Experimental evaluations of implementations of CCFV demonstrate notable improvements in the state-of-the-art solver CVC4 and make the solver veriT competitive with state-of-the-art solvers for several benchmark libraries, in particular those originating in verification problems. This was the subject of a publication in 2017 [article]. In a publication at TACAS 2018 [article], we revisit enumerative instantiation for SMT.

We are currently investigating machine learning techniques as a tool for filtering instantiations. Other ongoing work aims at lifting the above techniques to higher-order reasoning.

#### Real Quantifier Elimination, Decision, and Satisfiability and Their Applications

Effective quantifier elimination procedures for first-order theories provide a powerful tool for generically solving a wide range of problems based on logical specifications. In contrast to general first-order provers, quantifier elimination procedures are based on a fixed set of admissible logical symbols with an implicitly fixed semantics. This admits the use of sub-algorithms from symbolic computation. Specifically, quantifier elimination for the reals has been successfully applied in geometry, verification, and the life sciences.

A survey paper with an invited talk at ISSAC 2018 provides a coherent view on the scientific developments of the virtual substitution method for real quantifier elimination during the past three decades [article]. Another recent survey paper had illustrated relevant applications of that method [article].

#### Non-Linear Arithmetic in SMT

Joint work with M. Ogawa and X. T. Vu (Japan Advanced Institute of Science and Technology), V. K. To (University of Engineering and Technology, VNU, Hanoi, Vietnam).

In the context of the SC$^{2}$ project (cf. sections 78 and 159), we study the theory, design techniques, and implement software to push forward the non-linear arithmetic (NLA) reasoning capabilities in SMT. Previously, we designed a framework to combine interval constraint propagation with other decision procedures for NLA, with promising results, notably in the international competition of SMT solvers. We also studied integration of these procedures into combinations of theories. These ideas were validated through an implementation within the veriT solver, together with code from the raSAT solver (from JAIST), and they were presented at the SC$^{2}$ workshop 2018 [article].

#### Proofs for SMT

We have previously developed a framework for processing formulas in automatic theorem provers, with generation of detailed proofs. The main components are a generic contextual recursion algorithm and an extensible set of inference rules. Clausification, skolemization, theory-specific simplifications, and expansion of ‘let’ expressions are instances of this framework. With suitable data structures, proof generation adds only a linear-time overhead, and proofs can be checked in linear time. We implemented the approach in the SMT solver veriT. This allowed us to dramatically simplify the code base while increasing the number of problems for which detailed proofs can be produced, which is important for independent checking and reconstruction in proof assistants. This was the subject of a conference publication in 2017. In 2018, we polished the approach, fully implementing proof reconstruction of veriT proofs in Isabelle. A paper has been accepted in the Journal of Automated Reasoning.

#### A More Efficient Technique for Validating Cyclic Pre-Proofs

Cyclic pre-proofs can be represented as sets of finite tree derivations with back-links. In a setting of first-order logic with inductive definitions, the nodes of the tree derivations are labelled by sequents and the back-links connect particular terminal nodes, referred to as buds, to other nodes labelled by the same sequent. However, only some back-links can constitute sound pre-proofs. Previously, it was shown that special ordering and derivability conditions, defined along the minimal cycles of the digraph representing a particular normal form of the cyclic pre-proof, are sufficient for validating the back-links. In that approach, a single constraint could be checked several times when processing different minimal cycles, hence one may require additional recording mechanisms to avoid redundant computation in order to achieve polynomial time complexity.

In [article], we presented a new approach that does not need to process minimal cycles. It is based on a normal form in which the validation conditions are defined by taking into account only the root-bud paths from the non-singleton strongly connected components of its digraph.

#### Mechanical Synthesis of Algorithms by Logical and Combinatorial Techniques

Joint work with Isabela Dramnesc (West University, Timisoara, Romania) and Tudor Jebelean (RISC, Johannes Kepler University, Linz, Austria).

In [article], we developed logical and combinatorial methods for automating the generation of sorting algorithms for binary trees, starting from input-output specifications and producing conditional rewrite rules. The main approach consists in proving (constructively) the existence of an appropriate output from every input. The proof may fail if some necessary sub-algorithms are lacking. Then, their specifications are suggested and their synthesis is performed by the same principles.

The main goal is to avoid the possibly prohibitive cost of pure resolution proofs by using a natural-style proving in which domain-specific strategies and inference steps lead to a significant increase of efficiency. We introduce novel techniques and combine them with classical techniques for natural-deduction style proving, as well as methods based on the properties of domain-specific relations and functions. In particular, we use combinatorial techniques in order to generate possible witnesses, which in certain cases lead to the discovery of new induction principles. From the proof, the algorithm is extracted by transforming inductive proof steps into recursions, and case-based proof steps into conditionals.

The approach was demonstrated using the Theorema system for developing the theory, implementing the prover, and performing the proofs of the necessary properties and synthesis conjectures. It was also validated in the Coq system, allowing us to compare the facilities of the two systems in view of our application.

#### Formal Proofs of Tarjan’s Algorithm

Joint work with Ran Chen (Chinese Academy of Sciences), Cyril Cohen and Laurent Théry (Inria Sophia Antipolis Méditerranée, Marelle), and Jean-Jacques Lévy (Inria Paris, Pi.r2).

We compare formal proofs of Tarjan’s algorithm for computing strongly connected components in a graph in three different proof assistants: Coq, Isabelle/HOL, and Why3. Our proofs are based on a representation of the algorithm as a functional program (rather than its more conventional imperative representation), which was verified in Why3 by Chen and Lévy [article]. The proofs in all three assistants are thus closely comparable and in particular employ the same invariants. This lets us focus on differences between the formalizations that are due to idiosyncrasies of the proof assistants, for example in the handling of mutually recursive function definitions whose termination is not obvious according to syntactic criteria, and compare the degree of automation in the three assistants. A report is available on arXiv [article].
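For readers unfamiliar with the algorithm, the sketch below shows Tarjan’s algorithm in its conventional imperative form with mutable index and low-link tables. It is not the functional program that was verified in the three proof assistants; the function name and the adjacency-dictionary representation of graphs are assumptions made for this illustration.

```python
def tarjan_scc(graph):
    """Return the strongly connected components of a directed graph.
    `graph` maps each node to an iterable of its successors."""
    index = {}       # discovery number of each visited node
    lowlink = {}     # smallest discovery number reachable via the DFS stack
    stack = []       # nodes of components that are not yet complete
    on_stack = set()
    components = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # v is the root of a component: pop the component off the stack.
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            components.append(frozenset(component))

    for v in graph:
        if v not in index:
            visit(v)
    return components

example_graph = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4], 6: []}
example_components = tarjan_scc(example_graph)
```

The recursion in `visit` is the part whose termination is not syntactically obvious: each call marks a new node as visited, so the number of unvisited nodes decreases, but this argument depends on the shared mutable state rather than on the structure of the arguments.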

#### Parameterized Verification of Threshold-Guarded Fault-Tolerant Distributed Algorithms

Joint work with Nathalie Bertrand (Inria Rennes, SUMO project team) and Jure Kukovec, Marijana Lazić, Ilina Stoilkovska, Josef Widder, Florian Zuleger (TU Wien).

Many fault-tolerant distributed algorithms use threshold guards: processes broadcast messages and count the number of messages that they receive from their peers. Based on the total number $n$ of processes and an upper bound on the number $t$ of faulty processes, a correct process tolerates faults by receiving “sufficiently many” messages. For instance, when a correct process has received $t+1$ messages from distinct processes, at least one of these messages must originate from a non-faulty process. The main challenge is to verify such algorithms for all combinations of parameters $n$ and $t$ that satisfy a resilience condition, e.g., $n>3t$.
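The counting argument behind the $t+1$ guard can be checked by brute force for small fixed parameters, as in the following sketch. The function name is hypothetical, and the enumeration only covers one instance of $n$ and $t$ at a time, which is precisely the limitation that parameterized verification aims to overcome.

```python
from itertools import combinations

def every_quorum_has_correct_sender(n, t, quorum_size):
    """Check, for fixed n and t, that any set of `quorum_size` distinct
    senders contains at least one non-faulty process, whichever t processes
    are faulty. Taking exactly t faulty processes is the worst case."""
    processes = range(n)
    for faulty in combinations(processes, t):
        faulty_set = set(faulty)
        for senders in combinations(processes, quorum_size):
            if set(senders) <= faulty_set:
                return False  # all senders could be faulty
    return True
```

For example, with $n = 4$ and $t = 1$ a guard of $t + 1 = 2$ messages passes the check, whereas a guard of only $t = 1$ message does not.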

In earlier work, we introduced threshold automata for representing processes in such algorithms and showed that systems of threshold automata have bounded diameters that do not depend on the parameters such as $n$ and $t$, provided that a single-step acceleration is allowed [article]. In the contribution [article] to CONCUR’18, we reported on various extensions of this result to less restrictive forms of automata: the guards can be non-linear, shared variables can be incremented and decremented, non-trivial loops are allowed, and more general forms of acceleration are used. In the contribution [article] to ISOLA’18, we presented a parallel extension of our tool Byzantine Model Checker (ByMC), which allows one to distribute the verification queries across the computation nodes in an MPI cluster.

Our previous results apply to asynchronous algorithms. It is well-known that distributed consensus cannot be solved in purely asynchronous systems [article]. However, when an algorithm is provided with a random coin, consensus becomes solvable [article]. In [article], we introduced an approach to parameterized verification of randomized threshold-guarded distributed algorithms, which proceed in an unbounded number of rounds and toss a coin to break symmetries. This approach integrates two levels of reasoning: (1) proving safety and liveness of a single round system with ByMC by replacing randomization with non-determinism, (2) showing almost-sure termination of an algorithm by using the verification results for the non-deterministic system. To show soundness, we proved several theorems that reduce reasoning about multiple rounds to reasoning about a single round. We verified five prominent algorithms, including Ben-Or’s randomized consensus [article] and randomized one-step consensus (RS-BOSCO [article]). The verification of the latter algorithm required us to run experiments in Grid5000. A paper describing these results is under review at TACAS 2019.

Another way of making consensus solvable is to impose synchrony on the executions of a distributed system. In [article] we introduced synchronous threshold automata, which execute in lock-step and count the number of processes in given local states. In general, we showed that even reachability of a parameterized set of global states in such a distributed system is undecidable. However, we proved that systems of automata with monotonic guards have bounded diameters, which allows us to use SMT-based bounded model checking as a complete parameterized verification technique. We introduced a procedure for computing the diameter of a counter system of synchronous threshold automata, applied it to the counter systems of 8 distributed algorithms from the literature, and found that their diameters are tiny (from 1 to 4). This makes our approach practically feasible, despite undecidability in general. A paper about this work is under review at TACAS 2019.
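To illustrate what a parameter-independent diameter means, the sketch below computes, by explicit breadth-first search, the smallest number of steps within which every reachable state of a toy counter system is reached from some initial state. Everything in it is an assumption made for illustration: the toy system (four local states, one monotonic guard, deterministic lock-step moves, no faults), the function names, and the explicit-state search, which stands in for the SMT-based procedure described above and only works for fixed $n$.

```python
from collections import deque

def reachability_diameter(initial_states, successors):
    """Largest BFS depth of any state reachable from the initial states."""
    depth = {s: 0 for s in initial_states}
    queue = deque(initial_states)
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            if nxt not in depth:
                depth[nxt] = depth[state] + 1
                queue.append(nxt)
    return max(depth.values())

def toy_successors(state):
    """One synchronous round. A state (k0, k1, k2, k3) counts the processes
    in local states 0..3. Processes in 0 move to 1 only if at least one
    process has already left state 0 (a monotonic guard); processes in 1
    and 2 always advance; state 3 is final."""
    k0, k1, k2, k3 = state
    moved_from_0 = k0 if (k1 + k2 + k3) >= 1 else 0
    return [(k0 - moved_from_0, moved_from_0, k1, k2 + k3)]

def toy_initial_states(n):
    """All ways of placing n processes into local states 0 and 1."""
    return [(n - k1, k1, 0, 0) for k1 in range(n + 1)]

diameters = {n: reachability_diameter(toy_initial_states(n), toy_successors)
             for n in range(1, 7)}
```

In this toy system the computed value is the same for every $n$ that is tried, which is the kind of behavior that makes bounded model checking up to the diameter a complete method.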

#### Symbolic Model Checking of TLA+ Specifications

Joint work with Jure Kukovec, Thanh Hai Tran, Josef Widder (TU Wien).

TLA+ is a general language introduced by Leslie Lamport for specifying temporal behavior of computer systems [article]. The tool set for TLA+ includes an explicit-state model checker, TLC. As explicit-state model checkers do not scale to large verification problems, we started the project APALACHE (WWTF project ICT15-103, https://forsyte.at/research/apalache/) in 2016 to develop a symbolic model checker for TLA+.

In the contribution [article] to ABZ’18, we addressed the first principal challenge towards developing the symbolic model checker. We introduced a technique for identifying assignments in TLA+ specifications and decomposing a monolithic TLA+ specification into a set of symbolic transitions. At the TLA+ community meeting 2018, we presented a prototype solution [article] to a second challenge. We have developed an SMT encoding of TLA+ expressions for model checking purposes. We presented the first version of a symbolic model checker for TLA+ specifications that works under the same assumptions as TLC: the input parameters are fixed and finite structures, and the reachable states are finite structures. The experimental results are encouraging, and we are thus preparing a conference submission. Finally, in a contribution to the DSN Workshop on Byzantine Consensus and Resilient Blockchains [article], we considered challenges for automatic verification techniques for Blockchain protocols.

#### Making Explicit Domain Knowledge in Formal System Development

Joint work with partners of the IMPEX project.

The IMPEX project (cf. section 78) advocates that formal modeling languages should explicitly represent the knowledge resulting from an analysis of the application domain, and that ontologies are good candidates for handling explicit domain knowledge. We strive to offer rigorous mechanisms for handling domain knowledge in design models. The main results of the project are summarized in [article] and show the importance of three operations over models, namely annotation, dependency, and refactoring [article].

#### Incremental Development of Systems and Algorithms

Joint work with Manamiary Bruno Andriamiarina, Neeraj Kumar Singh (IRIT, Toulouse), Rosemary Monahan (NUI Maynooth, Ireland), Zheng Cheng (LINA, Nantes), and Mohammed Mosbah (LaBRI, Bordeaux).

The development of distributed algorithms and, more generally, of distributed systems, is a complex, delicate, and challenging process. The approach based on refinement applies a design methodology that starts from the most abstract model and leads, in an incremental way, to a distributed solution. The use of a proof assistant gives a formal guarantee about the conformance of each refinement with the model preceding it. Our main result during 2018 is the development of patterns for different kinds of paradigms including the iterative pattern, the recursive pattern, and the distributed pattern [article].

#### Synthesis of Security Chains for Software Defined Networks

Joint work with Rémi Badonnel and Abdelkader Lahmadi of the Resist team of Inria Nancy – Grand Est.

The PhD work of Nicolas Schnepf focuses on applying formal methods techniques in the area of network communications, and in particular for the construction, analysis, and optimization of security functions in the setting of software-defined networks (SDN). In previous work, we defined an extension of the Pyretic language [article] for representing both the control and the data planes of SDN controllers and implemented a translation of that extension to the input languages of the nuXmv model checker and of SMT solvers.

This year, our work focused on synthesizing security chains for Android applications based on their observed communications. The first step consists in inferring probabilistic finite-state automata models that represent network flows generated by Android applications. Comparing our models with automata produced by the state-of-the-art tools Invarimint and Synoptic, we obtain representations that are significantly smaller than those generated by Synoptic and as succinct as those inferred by Invarimint, but that include information about transition probability, unlike Invarimint. This work was presented at NOMS 2018 [article], [article]. In a second step, we encode security policies defined by network administrators in a rule-based program that is then used to generate a high-level representation of a security chain for the application, which is then translated to Pyretic. For example, an application that contacts different ports at the same IP address in rapid succession could be qualified as performing a port scanning attack, and these connections could then be blocked. This work was presented at AVoCS 2018 [article]. The third step consists in factorizing the chains generated for different applications in order to reduce the size of the overall chain that must be deployed in a network. A paper describing appropriate algorithms for that purpose will be presented at IM 2019.
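The port scanning example from the second step can be sketched as a simple rule over observed flows. The record format, the function name, and the thresholds (`min_ports`, `window`) are hypothetical choices for this illustration; the sketch only shows the kind of condition such a rule expresses and does not generate a security chain or Pyretic code.

```python
from collections import defaultdict

def detect_port_scans(flows, min_ports=3, window=1.0):
    """Return the destination IPs at which at least `min_ports` distinct
    ports were contacted within `window` seconds.
    `flows` is an iterable of (timestamp_seconds, dst_ip, dst_port)."""
    by_ip = defaultdict(list)
    for ts, ip, port in sorted(flows):
        by_ip[ip].append((ts, port))
    flagged = set()
    for ip, events in by_ip.items():
        start = 0
        for end in range(len(events)):
            # Shrink the sliding window until it spans at most `window` seconds.
            while events[end][0] - events[start][0] > window:
                start += 1
            ports = {p for _, p in events[start:end + 1]}
            if len(ports) >= min_ports:
                flagged.add(ip)
                break
    return flagged

example_flows = [
    (0.0, "10.0.0.5", 22), (0.2, "10.0.0.5", 23), (0.4, "10.0.0.5", 80),
    (0.0, "10.0.0.9", 443), (5.0, "10.0.0.9", 443), (9.0, "10.0.0.9", 8443),
]
suspicious = detect_port_scans(example_flows)
```

In the example, only the first address is flagged: three different ports are contacted within less than a second, whereas the connections to the second address are spread out over several seconds.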

#### Satisfiability Techniques for Reliability Assessment

Joint work with Nicolae Brînzei at Centre de Recherche en Automatique de Nancy.

The reliability of complex systems is typically assessed using probabilistic methods, based on the probabilities of failures of individual components, relying on graphical representations such as fault trees or reliability block diagrams. Mathematically, the dependency of the overall system on the working status of its components is described by its Boolean-valued structure function, and binary decision diagrams (BDDs) have traditionally been used to construct a succinct representation of that function. We explore the use of modern satisfiability techniques as an alternative to BDD-based algorithms. In 2018, our work focused on the encoding of dynamic fault trees whose structure function needs to take into account the order in which components fail.
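As a concrete illustration of a structure function, the sketch below evaluates the reliability of a small static system by enumerating all component states. It uses neither BDDs nor a SAT solver, it takes exponential time, and it assumes that component failures are independent; the function names and the example system are hypothetical. Dynamic fault trees, where the order of failures matters, are outside its scope.

```python
from itertools import product

def system_reliability(structure, reliabilities):
    """Probability that the system works, given a Boolean structure function
    and the working probability of each (independent) component."""
    names = sorted(reliabilities)
    total = 0.0
    for states in product([True, False], repeat=len(names)):
        assignment = dict(zip(names, states))
        if structure(assignment):
            prob = 1.0
            for name in names:
                p = reliabilities[name]
                prob *= p if assignment[name] else 1.0 - p
            total += prob
    return total

def example_structure(x):
    """Component a in parallel with the series connection of b and c."""
    return x["a"] or (x["b"] and x["c"])

example_reliability = system_reliability(
    example_structure, {"a": 0.9, "b": 0.9, "c": 0.9})
```

For this system the result agrees with the closed form $1 - (1 - 0.9)(1 - 0.9 \cdot 0.9) = 0.981$.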