# Research

### Overall Objectives

The VeriDis project team includes members of the MOSEL group at LORIA, the computer science laboratory in Nancy, and members of the research group Automation of Logic at Max-Planck-Institut für Informatik in Saarbrücken. It is headed by Stephan Merz and Christoph Weidenbach. VeriDis was created in 2010 as a local research group of Inria Nancy – Grand Est and has been an Inria project team since July 2012.

The objectives of VeriDis are to contribute to advances in verification techniques, including automated and interactive theorem proving, and to make them available for the formal development and analysis of concurrent and distributed algorithms and systems, within the framework of mathematically precise and practically applicable development methods. We intend to assist designers of algorithms and systems in carrying out formally proved developments, where proofs of relevant properties, as well as bugs, can be found with a high degree of automation.

Verification techniques based on theorem proving are already having substantial impact. In particular, they have been successfully applied to the verification and analysis of sequential programs, often in combination with static analysis and software model checking. Ideally, systems and their properties would be specified in high-level, expressive languages, errors in specifications would be discovered automatically, and finally, full verification could also be performed completely automatically. Due to the inherent complexity of the problem, this cannot be achieved in general. We have, however, observed significant advances in theorem proving in recent years. We are particularly interested in the integration of different deduction techniques and tools, such as automated theorem proving for relevant theories, such as different fragments of arithmetic. These advances suggest that a substantially higher degree of automation can be achieved in system verification than what is available in today’s verification tools.

VeriDis aims at exploiting and further developing automation in system verification, and at applying its techniques within the context of concurrent and distributed algorithms, which are by now ubiquitous and whose verification is a big challenge. Concurrency problems are central for the development and verification of programs for multi- and many-core architectures, and distributed computation underlies the paradigms of grid and cloud computing. The potential of distributed systems for increased resilience to component failures makes them attractive in many contexts, but also makes formal verification important and challenging. We aim at moving current research in this area to a new level of productivity and quality. To give a concrete example: today the designer of a new distributed protocol may validate it using testing or model checking. Model checking will help finding bugs, but can only guarantee properties of a high-level model of the protocol, usually restricted to finite instances. Testing distributed systems and protocols is notoriously difficult because corner cases are hard to establish and reproduce. Also, many testing techniques require an executable, whose production is expensive and time-consuming, and since an implementation is needed, errors are found only when they are expensive to fix. The techniques that we develop aim at automatically proving significant properties of the protocol already during the design phase. Our methods mainly target designs and algorithms at high levels of abstraction; we aim at components of operating systems, distributed services, and down to the (mobile) network systems industry.

## Results

#### IsaFoL: Isabelle Formalization of Logic

Joint work with Andreas Halkjær From (DTU Copenhagen), Alexander Birch Jensen (DTU Copenhagen), Maximilian Kirchmeier (TU München), Peter Lammich (TU München), John Bruntse Larsen (DTU Copenhagen), Julius Michaelis (TU München), Tobias Nipkow (TU München), Nicolas Peltier (IMAG Grenoble) Anders Schlichtkrull (DTU Copenhagen), Dmitriy Traytel (ETH Zürich), and Jørgen Villadsen (DTU Copenhagen).

Researchers in automated reasoning spend a significant portion of their work time specifying logical calculi and proving metatheorems about them. These proofs are typically carried out with pen and paper, which is error-prone and can be tedious. As proof assistants are becoming easier to use, it makes sense to employ them.

In this spirit, we started an effort, called IsaFoL (Isabelle Formalization of Logic), that aims at developing libraries and methodology for formalizing modern research in the field, using the Isabelle/HOL proof assistant.https://bitbucket.org/isafol/isafol/wiki/Home Our initial emphasis is on established results about propositional and first-order logic. In particular, we are formalizing large parts of Weidenbach’s forthcoming textbook, tentatively called Automated Reasoning—The Art of Generic Problem Solving.

The objective of formalization work is not to eliminate paper proofs, but to complement them with rich formal companions. Formalizations help catch mistakes, whether superficial or deep, in specifications and theorems; they make it easy to experiment with changes or variants of concepts; and they help clarify concepts left vague on paper.

The repository contains 14 completed entries and four entries that are still in development. Notably, Mathias Fleury formalized a SAT solver framework with learn, forget, restart, and incrementality. This year he extended it with key optimizations such as the two-watched-literal procedure. The corresponding paper, written together with Jasmin Blanchette and Peter Lammich, was accepted at a highly competitive conference (CPP 2018).

#### Extension of Term Orders to $\lambda$ -Free Higher-Order Logic

Superposition is one of the most successful proof calculi for first-order logic today, but in contrast to resolution, tableaux, and connections, it has not yet been generalized to higher-order logic (also called simple type theory). Yet, most proof assistants and many specification languages are based on some variant of higher-order logic.

This motivates us to design a graceful generalization of superposition: a proof calculus that behaves like standard superposition on first-order problems and that smoothly scales up to arbitrary higher-order problems. A challenge is that superposition relies on a simplification order, which is fixed in advance of the proof attempt, to prune the search space.

We started our investigations by focusing on a fragment devoid of $\lambda$ -abstractions, but with partial application and application of variables, two crucial higher-order features. We generalized the two main orders that are used in superposition-based provers today—the lexicographic path order (LPO) [article] and the Knuth-Bendix order (KBO) [article]. The new orders gracefully generalize their first-order counterparts and enjoy nearly all properties needed for superpositions. An exception is compatibility with contexts, which is missing for LPO and some KBO variants. Preliminary work suggests that we can define a version of the superposition calculus that works well in theory and practice (i.e., is refutationally complete and does not lead to a search-space explosion) despite the missing property.

#### A Fine-Grained Approach of Understanding First-Order Logic Complexity

By the introduction of the separated fragment [article] we have initiated a new framework for a fine-grained understanding of the complexity of fragments of first-order logic, with and without the addition of theories. We have related the classes of the polynomial hierarchy to subclasses of the separated fragment [article] and developed new decidability results [article], [article] based on the techniques of our framework for the combination of the Bernays-Schoenfinkel subfragment with linear arithmetic.

#### Theorem Proving Based on Approximation-Refinement into the Monadic Shallow Linear Fragment with Straight Dismatching Constraints

We have introduced an approximation-refinement approach for first-order theorem proving based on counterexample-guided abstraction refinement [article]. A given first-order clause set is transformed into an over-approximation contained in the fragment of monadic, shallow, linear clauses with straight dismatching constraints. We have shown the fragment to be decidable, strictly extending known results. If the abstraction obtained that way is satisfiable, so is the original clause set. However, if it is unsatisfiable, then the approximation provides a terminology for lifting the found refutation, step by step, into a proof for the original clause set. If lifting fails, the cause is analyzed to refine the original clause set such that the found refutation is ruled out for the future, and the procedure repeats. We have shown that this approach is superior to all known calculi on certain classes of first-order clauses. In particular, it is able to detect satisfiability of clause sets that have only infinite models.

#### Combination of Satisfiability Procedures

Joint work with Christophe Ringeissen from the PESTO project-team of Inria Nancy – Grand Est, and Paula Chocron at IIIA-CSIC, Bellaterra, Catalonia, Spain.

A satisfiability problem is often expressed in a combination of theories, and a natural approach consists in solving the problem by combining the satisfiability procedures available for the component theories. This is the purpose of the combination method introduced by Nelson and Oppen. However, in its initial presentation, the Nelson-Oppen combination method requires the theories to be signature-disjoint and stably infinite (to ensure the existence of an infinite model). The design of a generic combination method for non-disjoint unions of theories is difficult, but it is worth exploring simple non-disjoint combinations that appear frequently in verification. An example is the case of shared sets, where sets are represented by unary predicates. Another example is the case of bridging functions between data structures and a target theory (e.g., a fragment of arithmetic).

In 2015, we defined [article] a sound and complete combination procedure à la Nelson-Oppen for the theory of absolutely free data structures (including lists and trees) connected to another theory via bridging functions. This combination procedure has also been refined for standard interpretations. The resulting theory has a nice politeness property, enabling combinations with arbitrary decidable theories of elements. We also investigated other theories [article] amenable to similar combinations: this class includes the theory of equality, the theory of absolutely free data structures, and all the theories in between.

In 2017, we have been improving the framework and unified both results. A new paper is in preparation.

#### Quantifier Handling in SMT

Joint work with Andrew J. Reynolds, Univ. of Iowa, USA.

SMT solvers generally rely on various instantiation techniques for handling quantifiers. We built a unifying framework encompassing quantified formulas with equality and uninterpreted functions, such that the major instantiation techniques in SMT solving can be cast in that framework. It is based on the problem of E-ground (dis)unification, a variation of the classic Rigid E-unification problem. We introduced a sound and complete calculus to solve this problem in practice: Congruence Closure with Free Variables (CCFV). Experimental evaluations of implementations of CCFV demonstrate notable improvements in the state-of-the-art solver CVC4 and make the solver veriT competitive with state-of-the-art solvers for several benchmark libraries, in particular those originating in verification problems. This was the subject of a publication [article]. In later, unpublished work, we are revisiting enumerative instantiation for SMT. This effort takes place in the context of the Matryoshka project.

#### Non-Linear Arithmetic in SMT

In the context of the SMArT ANR-DFG (Satisfiability Modulo Arithmetic Theories), KANASA and SC ${}^{2}$ projects (cf. sections 59 and 82), we study the theory, design techniques, and implement software to push forward the non-linear arithmetic (NLA) reasoning capabilities in SMT. This year, we designed a framework to combine interval constraint propagation with other decision procedures for NLA, with promising results, notably in the international competition of SMT solvers SMT-COMP 2017. We also studied integration of these procedures into combinations of theories. The ideas are validated within the veriT solver, together with code from the raSAT solver (from JAIST). An article is in preparation.

We also adapted the subtropical method to use in an SMT context, with valuable results. This was the subject of a publication in 2017 [article].

#### Proofs for SMT

We have developed a framework for processing formulas in automatic theorem provers, with generation of detailed proofs. The main components are a generic contextual recursion algorithm and an extensible set of inference rules. Clausification, skolemization, theory-specific simplifications, and expansion of `let’ expressions are instances of this framework. With suitable data structures, proof generation adds only a linear-time overhead, and proofs can be checked in linear time. We implemented the approach in the SMT solver veriT. This allowed us to dramatically simplify the code base while increasing the number of problems for which detailed proofs can be produced, which is important for independent checking and reconstruction in proof assistants. This was the subject of a publication in  [article]. This effort takes place in the context of the Matryoshka project.

#### Coding Modal and Description Logics in SAT solvers

The application scenario behind this research is the verification of graph transformations, which themselves are relevant for a wide range of practical problems such as pointer structures in imperative programs, graph databases or access control mechanisms.

Graph structures can typically be perceived as models of modal logics, and modal logics and variants (such as description logics that are the basis for the web ontology language OWL) are in principle suitable specification formalisms for graph transformations. It turns out, however, that pure modal logics are often not sufficiently expressive for the intended verification purpose and that extensions are needed for which traditional proof methods such as tableau calculi become complex: the termination of the calculi are often very difficult to prove, and huge efforts are required to obtain an efficient implementation.

For these reasons, we have explored methods of encoding the above-mentioned logics in SAT and SMT solvers such as CVC4 and veriT. The idea is to traverse the formula to be verified in order to span up a pre-model that possibly contains more elements (worlds in a Kripke structure) than the real model, and then to run a solver to find out which of these elements can effectively be realized. A prototype has been implemented, with encouraging results. It remains to connect this prototype to the graph verification engine and to publish this work.

#### Work on the TLA+ Proof System

We continued our work on encoding set-theoretic formulas in multi-sorted first-order logic, and in particular for SMT solvers. Specifically, we unified and streamlined a technique combining an injection of unsorted expressions into sorted languages, simplification by rewriting, and abstraction that underlies the SMT backend of the TLA+ proof system TLAPS. A presentation of our technique was accepted in the journal Science of Computer Programming, to appear in 2018.

The proof of the join protocol in a pure-join variant of the Pastry protocol [article] implementing a distributed hash table over a peer-to-peer network is the largest case study carried out so far within TLAPS. Consisting of roughly 30k lines of proof, it was developed as part of Noran Azmy’s PhD thesis, defended at the end of 2016 [article]. A presentation of the design of the protocol and its proof was accepted in the journal Science of Computer Programming, to appear in 2018.

#### Automated Analysis of Systems of ODE for Multistationarity

Joint work with R. Bradford and J. Davenport (Bath, UK), M. England (Coventry, UK), H. Errami, C. Hoyt, and A. Weber (Bonn, Germany), V. Gerdt (Dubna, Russia), D. Grigoriev (Lille, France), O. Radulescu (Montpellier, France)

We considered the problem of determining multiple steady states for positive real values in models of biological networks. Investigating the potential for these in models of the mitogen-activated protein kinases (MAPK) network has consumed considerable effort using special insights into the structure of corresponding models. We have applied combinations of symbolic computation methods for mixed equality/inequality systems, specifically automated deduction methods like virtual substitution, lazy real triangularization and cylindrical algebraic decomposition. We have determined multistationarity of an 11-dimensional MAPK network when numeric values are known for all but potentially one parameter. More precisely, our considered model has 11 equations in 11 variables and 19 parameters, 3 of which are of interest for symbolic treatment, and furthermore positivity conditions on all variables and parameters [article].

Subsequent work [article] demonstrates that our techniques benefit tremendously from a new graph theoretical symbolic preprocessing method. We apply our combined techniques to visualize of parameter regions for multistationarity. Comparing computation times and quality of results it turns out that our automated deduction-based approach clearly outperforms established numerical continuation methods.

While automated deduction technology is a bit under the hood here, this interdisciplinary research line addresses important questions related to contemporary research in systems biology. With researchers from that area very actively involved, the results are recognized also within their communities.

#### Making Explicit Domain Knowledge in Formal System Development

Joint work with partners of the IMPEX project.

As explained in the description of the IMPEX project in section 59, we advocate that formal modeling languages should explicitly represent the knowledge resulting from an analysis of the application domain, and that ontologies are good candidates for handling explicit domain knowledge. Our objective in doing so is to offer rigorous mechanisms for handling domain knowledge in design models.

We developed the notion of dependency for state-based models. Context-awareness is an important feature in system design. We argue that in proof systems and conceptual modelling this notion should be highlighted precisely. Since we focus on conceptual modelling, understandability and clarity are of high importance. We introduce a new definition [article] for proof context in state-based formalisms with an application to the Event-B modeling language. Furthermore, we introduce a dependency relation between two Event-B models. The contextualization of Event-B models is based on knowledge provided from domains that we classified into constraints, hypotheses and dependencies. The dependency mechanism between two models makes it possible to structure the development of systems models, by organizing phases identified in the analyzed process. These ideas are inspired by work based on the modelling of situations in situation theory that emphasize capabilities of type theory with regard to situation modelling to represent knowledge. Our approach is illustrated on small case studies, and was validated on a development of design patterns for voting protocols.

#### Incremental Development of Systems and Algorithms

Joint work with Manamiary Bruno Andriamiarina, Neeraj Kumar Singh (IRIT, Toulouse), Rosemary Monahan (NUI Maynooth, Ireland), Zheng Cheng (LINA, Nantes), and Mohammed Mosbah (LaBRI, Bordeaux).

The development of distributed algorithms and, more generally, of distributed systems, is a complex, delicate, and challenging process. The approach based on refinement applies a design methodology that starts from the most abstract model and leads, in an incremental way, to a distributed solution. The use of a proof assistant gives a formal guarantee on the conformance of each refinement with the model preceding it.

Our main result during 2017 is the development of a proved-based pattern for integrating the local computation models and the Visidia platform [article].

#### Modeling Network Flows in View of Building Security Chains

We are working on the application of formal modeling and verification techniques in the area of network communications, and in particular for constructing security functions in a setting of software-defined networks (SDN). Concretely, Nicolas Schnepf defined an extension of the Pyretic language [article] taking into account both the control and the data planes of SDN controllers and implemented a translation of that extension to the input languages of the nuXmv model checker and of SMT solvers. This work was published at NetSoft 2017 [article].

Extending this approach, we have worked on inferring probabilistic finite-state automata models that represent network flows generated by Android applications. The objective is to exploit this representation for generating security chains that detect significant deviations from the behavior represented by the automata and can initiate protective actions. Comparing our models with automata produced by the state-of-the-art tools Invarimint and Synoptic, we obtain representations that are as succinct as those inferred by Invarimint, and significantly smaller than Synoptic, but that include information about transition probability, which Invarimint does not. This work was accepted for publication at NOMS 2018.

#### Satisfiability Techniques for Reliability Assessment

Joint work with Nicolae Brînzei at Centre de Recherche en Automatique de Nancy.

The reliability of complex systems is typically assessed using probabilistic methods, based on the probabilities of failures of individual components, relying on graphical representations such as fault trees or reliability block diagrams. Mathematically, the dependency of the overall system on the working status of its components is described by its Boolean-valued structure function, and binary decision diagrams (BDDs) have been used to construct a succinct representation of that function. We explore the use of modern satisfiability techniques as an alternative to BDD-based algorithms. In [article], we develop three different algorithms for computing minimal tie sets (i.e., component configurations that ensure that the system is functioning). Our algorithms are based on either conjunctive or disjunctive normal form representations of the structure function or on the Hasse diagram representing the configurations. These algorithms have been prototypically implemented in Python, and we are evaluating them on existing benchmarks in order to understand which algorithm works best for typical fault dependencies.

#### Statistical evaluation of the robustness of production schedules

Joint work with Alexis Aubry, Sara Himmiche, Pascale Marangé, and Jean-François Pétin at Centre de Recherche en Automatique de Nancy.

Finding a good schedule for a production system, especially when it is flexible and when several machines can perform the same operation on products, is a challenging and interesting problem. For a long time, operations research has provided state-of-the-art methods for optimizing scheduling problems. However, approaches based on Discrete Event Systems present interesting alternatives, especially when dealing with uncertainties on the demand or the production time. In this particular case, the flexibility of the automata-based modeling approach is really useful. Using probabilistic timed automata, we demonstrated [article] that statistical model checking can be used successfully for evaluating the robustness of a given schedule w.r.t. probabilistic variations of the processing time. We were thus able to compare different schedules based on their level of service (i.e., the probability that the system will complete the production process within a deadline slightly higher that the schedule time) and their sensitivity (the minimal deadline for which the level of service is greater than a given threshold) [article].

An interdisciplinary workshop on this topic was organized jointly with our colleagues of Centre de Recherche en Automatique and funded by Fédération Charles Hermite.

#### Using Cubicle for Verifying TLA+ Specifications

Cubicle http://cubicle.lri.fr is a model checker for the verification of parameterized transition systems whose state is described by arrays of variables indexed by an abstract sort representing processes. During her internship, Poonam Kumari designed a translation algorithm from a restricted class of TLA+ specifications into the input language of Cubicle. A prototypical implementation demonstrates the feasibility of the approach, although more work will be necessary to widen the scope of the translation. This work will be continued within the PARDI project, described in section 59.