Research


Overall objectives

The general objective of the Toccata project is to promote formal specification and computer-assisted proof in the development of software that requires high assurance in terms of safety and correctness with respect to its intended behavior. Such safety-critical software appears in many application domains like transportation (e.g. aviation, aerospace, railway, automotive), communication (e.g. internet, smartphones), health devices, data management in the cloud (confidentiality issues), etc. The number of tasks performed by software is quickly increasing, together with the number of lines of code involved. Given the need for high assurance of safety in the functional behavior of such applications, devising automated (in the sense of computer-assisted) methods and techniques that bring guarantees of safety has become a major challenge. The most widely used approach to checking the safety of software, past and present, is to apply heavy test campaigns, which account for a large part of the cost of software development. Yet these campaigns cannot ensure that all bugs are caught, and remaining bugs may have catastrophic consequences.

Generally speaking, software verification approaches pursue three goals: (1) verification should be sound, in the sense that no bugs should be missed, (2) verification should not produce false alarms, or as few as possible, (3) it should be as automatic as possible. Reaching all three goals at the same time is a challenge. A large class of approaches emphasizes goals (2) and (3): testing, run-time verification, symbolic execution, model checking, etc. Static analysis, such as abstract interpretation, emphasizes goals (1) and (3). Deductive verification emphasizes (1) and (2). The Toccata project is mainly interested in exploring the deductive verification approach, although we also occasionally combine it with the other techniques.

In the past decade, significant progress has been made in the domain of deductive program verification, as illustrated by several success stories in applying these techniques to industrial-scale software. For example, the Atelier B system was used to develop part of the embedded software of the Paris metro line 14 44 and other railway-related systems; a formally proved C compiler was developed using the Coq proof assistant 71; the L4.verified project developed a formally verified micro-kernel with high security guarantees, using analysis tools on top of the Isabelle/HOL proof assistant 70. A bug in the JDK implementation of TimSort was discovered using the KeY environment 67 and a fixed version was proved sound. Another sign of recent progress is the emergence of deductive verification competitions (e.g. VerifyThis 45). Finally, a recent trend in industrial practice for the development of critical software is to require more and more guarantees of safety: e.g., the DO-178C standard for developing avionics software adds to the former DO-178B the use of formal models and formal methods, and emphasizes the need for certification of the analysis tools involved in the process.

Last activity report: 2024

Results

New results

Foundations and Spreading of Deductive Program Verification

Improving Verification Condition Generation

Filliâtre, Paskevich and Patault introduce Coma, a formally defined intermediate verification language. Specification annotations in Coma take the form of assertions mixed with the executable program code. A special programming construct representing the abstraction barrier is used to separate, inside a subroutine, the “interface” part of the code, which is verified at every call site, from the “implementation” part, which is verified only once, at the definition site. In comparison with traditional contract-based specification, this offers the user an additional degree of freedom, as they can provide separate specification (or none at all) for different execution paths. They define a verification condition generator for Coma and prove its correctness. For programs where specification is given in a traditional way, with abstraction barriers at the function entries and exits, the verification conditions are similar to the ones produced by a classical weakest-precondition calculus. For programs where abstraction barriers are placed in the middle of a function definition, the user-written specification is seamlessly completed with the verification conditions generated for the exposed part of the code. In addition, their procedure can factorize selected subgoals on the fly, which leads to more compact verification conditions. They illustrate the use of Coma on two non-trivial examples, which have been formalized and verified using their implementation: a second-order regular expression engine and a sorting algorithm written in unstructured assembly code. An article presenting this work has been accepted for presentation at the ESOP conference in 2025 34.
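For reference, the classical weakest-precondition calculus that these verification conditions coincide with (on traditionally specified programs) is defined by equations such as the following, where $Q[x \leftarrow e]$ denotes substitution of $e$ for $x$ in the postcondition $Q$:

\[
\begin{aligned}
\mathrm{wp}(\mathtt{skip},\, Q) &= Q\\
\mathrm{wp}(x := e,\, Q) &= Q[x \leftarrow e]\\
\mathrm{wp}(s_1;\, s_2,\, Q) &= \mathrm{wp}(s_1,\, \mathrm{wp}(s_2,\, Q))\\
\mathrm{wp}(\mathtt{if}\ b\ \mathtt{then}\ s_1\ \mathtt{else}\ s_2,\, Q) &= (b \Rightarrow \mathrm{wp}(s_1,\, Q)) \land (\lnot b \Rightarrow \mathrm{wp}(s_2,\, Q))
\end{aligned}
\]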

Inference of Invariants

The automatic discovery of invariants is an important topic for increasing the automation of deductive verification. Fully automatic generation of invariants was studied in collaboration with an industrial partner: Cousineau and Marché 20 devised an original approach based on abstract interpretation, using a domain of parametrized binary decision diagrams. This includes an application to the verification of Ladder programs, discussed below in axis 4.

Formal Specification Language for C code

ACSL, short for ANSI/ISO C Specification Language 29, is meant to express precisely and unambiguously the expected behavior of a piece of C code. It plays a central role in Frama-C, as nearly all plug-ins eventually manipulate ACSL specifications, either to generate properties that are to be verified, or to assess that the code conforms to these specifications. It is thus very important to have a clear view of ACSL’s semantics in order to be sure that what one checks with Frama-C is really what one means. Marché contributed to a chapter 27 of the Frama-C book, describing the language in an agnostic way, independently of the various verification plug-ins that are implemented in the Frama-C platform. It contains many examples and exercises that introduce the main features of the language and highlights the most common pitfalls that users, even experienced ones, may encounter.

Automated Reasoning on Inductive Predicates

Filliâtre, Paskevich and Saudubray 24, 22 worked on an extension of the Why3 tool to allow proofs by induction on instances of inductive predicates, i.e., on finitely constructed derivations. It consists, on the one hand, of a new matching construct to analyze the form of such a derivation and, on the other hand, of a new notion of variant to justify the termination of a recursive function that proceeds according to the size of a derivation. They show how this extension can be implemented conservatively, with almost nothing changed in the verification condition generator of Why3. They illustrate the robustness of this contribution by translating from Coq to Why3 a non-trivial proof containing a large number of arguments by induction on inductive predicates.
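As an illustration of the underlying idea (in OCaml rather than WhyML, and not the paper's actual syntax), a finitely constructed derivation can be pictured as a value of an algebraic data type, on which one can match and recurse:

```ocaml
(* A derivation of the inductive predicate "n is even", as a tree whose
   constructors are the inference rules of the predicate. *)
type even =
  | Even_zero                (* rule: 0 is even *)
  | Even_plus2 of even       (* rule: if n is even, then n + 2 is even *)

(* Recursion over a derivation: termination is justified by a structural
   variant, since the recursive call is on a strict sub-derivation. *)
let rec size (d : even) : int =
  match d with
  | Even_zero -> 1
  | Even_plus2 d' -> 1 + size d'
```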

Cross-Language Symbolic Execution

One key problem in the field of software security is to expose software vulnerabilities effectively. Traditional testing methods have proved inadequate in terms of path coverage and bug detection, due to their reliance on concrete input values. Hui and Andrès 14 proposed combining symbolic execution with runtime annotation checking. Using symbolic input values instead of concrete ones allows the specified program properties to be checked on all corner cases. They introduce two implementations of symbolic runtime annotation checkers: one for C, using the ANSI/ISO C Specification Language, and one for WebAssembly, using the Weasel specification language, which they designed. Their approach combines the ease of use and expressiveness of runtime annotation checking with the power of symbolic program analysis.
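A minimal sketch of the idea, with assumed types and a stubbed solver call (none of which are the authors' actual implementation):

```ocaml
(* Symbolic values: concrete constants or terms over symbolic inputs. *)
type sym =
  | Const of int
  | Input of string          (* a symbolic input variable *)
  | Add of sym * sym

(* Path constraints accumulated along the currently explored path. *)
type constr = Lt of sym * sym | Eq of sym * sym

(* Placeholder for an actual SMT solver call. *)
let solver_sat (_ : constr list) : bool = false

(* Checking a runtime annotation symbolically: instead of evaluating the
   property on one concrete input, ask the solver whether its negation
   is satisfiable under the current path condition; if not, the property
   holds on every corner case reachable along this path. *)
let annotation_holds (path : constr list) (negated_prop : constr) : bool =
  not (solver_sat (negated_prop :: path))
```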

Reasoning on the Amortized Time Complexity

Ownership can also be used to reason about resources other than program memory. Guéneau, Jourdan et al. 17 present formal reasoning rules for verifying amortized complexity bounds in a language with thunks. Thunks can be used to construct persistent data structures with good amortized complexity, by suspending expensive computations and memoizing their result. Based on the notion of time credits and debits, this work presents a complete machine-checked reconstruction of Okasaki’s reasoning rules on thunks in a rich separation logic with time credits, and demonstrates their applicability by verifying several of Okasaki’s data structures.
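As a concrete illustration with OCaml's built-in thunks (a simplification of the setting studied in the paper):

```ocaml
(* [lazy]/[Lazy.force] suspend a computation and memoize its result, so
   the expensive work is performed at most once. In the time-credit
   reading, creating the thunk incurs a debit equal to the suspended
   cost, which must be paid off before the thunk may be forced. *)
let expensive : int Lazy.t =
  lazy (List.fold_left ( + ) 0 (List.init 1_000_000 (fun i -> i)))

let a = Lazy.force expensive   (* first force: pays the full cost *)
let b = Lazy.force expensive   (* memoized: constant time *)
```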

Peano Arithmetic and μMALL

Formal theories of arithmetic have traditionally been based on either classical or intuitionistic logic, leading to the development of Peano and Heyting arithmetic, respectively. In a collaboration with D. Miller, Manighetti 33 proposes to use μMALL as a formal theory of arithmetic based on linear logic. This formal system is presented as a sequent calculus proof system that extends the standard proof system for multiplicative-additive linear logic (MALL) with first-order universal and existential quantifiers, term equality and non-equality, and the least and greatest fixed point operators. They demonstrate how functions defined using μMALL relational specifications can be computed using a simple proof search algorithm. By incorporating weakening and contraction into μMALL, they obtain μLK+, a natural candidate for a classical sequent calculus for arithmetic. While important proof-theoretic results are still lacking for μLK+ (including cut-elimination and the completeness of focusing), they prove that μLK+ is consistent and that it contains Peano arithmetic. They also prove some conservativity results for μLK+ over μMALL.
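For reference, in standard sequent-calculus presentations of μMALL (following Baelde), the fixed point connectives are governed by an unfolding rule for the least fixed point μ and a coinduction rule for the greatest fixed point ν, where $S$ is the coinduction invariant:

\[
\frac{\vdash \Gamma,\; B\,(\mu B)\,\vec{t}}{\vdash \Gamma,\; \mu B\,\vec{t}}\;\mu
\qquad
\frac{\vdash \Gamma,\; S\,\vec{t} \qquad \vdash (S\,\vec{x})^{\perp},\; B\,S\,\vec{x}}{\vdash \Gamma,\; \nu B\,\vec{t}}\;\nu
\]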

Reasoning on mutable memory in program verification

Verification of Rust programs

One of the major successes of Toccata in recent years is the set of results obtained on the verification of Rust programs. Rust is a fairly recent programming language for system programming, bringing static guarantees of memory safety through a strong ownership policy. This feature opens promising advances for the deductive verification of Rust code. The project underlying the PhD thesis of Denis 60, supervised by Jourdan and Marché, is to propose techniques for the verification of Rust programs, using a translation to a purely functional language. The challenge of this translation is the handling of mutable borrows: pointers that grant temporary exclusive access to a region of memory. To overcome this, we used a technique inspired by prophecy variables to predict the final values of borrows 61. This method is implemented in a standalone tool called Creusot 62. The specification language of Creusot features the notion of prophecy mentioned above, which is central to specifying the behavior of programs performing memory mutation. However, so far, this prophecy-based encoding has only been described in the idealized setting of a core calculus.
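The essence of the prophecy-based encoding can be sketched in a purely functional language as follows (a deliberate simplification, not Creusot's actual translation): a mutable borrow is translated to a pair of the value currently seen through the borrow and the prophesied final value it will hold when it expires.

```ocaml
(* A mutable borrow &mut 'a, encoded as its current value paired with
   the prophecy of its final value at expiration time. *)
type 'a borrow = { cur : 'a; fin : 'a }

(* Writing through the borrow updates [cur] only: the prophecy [fin]
   still denotes the value the borrow will hold when it expires. *)
let write (b : 'a borrow) (v : 'a) : 'a borrow = { b with cur = v }

(* When the borrow expires, the prophecy is resolved: the verifier gets
   to assume [fin = cur], and the lender recovers the final value.
   (In the logic this is an assumption; here it is a runtime check.) *)
let expire (b : 'a borrow) : 'a =
  assert (b.fin = b.cur);
  b.fin
```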

Recently, Golfouse, Jourdan and Guéneau, in collaboration with Xavier Denis and Dominic Stolz, showed how one might “scale” this technique up to the setting of a realistic verification tool. After describing why doing so is non-trivial, they show how to integrate this encoding with two common features of deductive verification systems: ghost code and type invariants. Additionally, they provide concrete implementation strategies for key aspects of the encoding that were unspecified but turn out to be crucial when considering realistic programs. This work is implemented as an extension of Creusot and presented in a draft paper, to be submitted for publication in 2025.

Ownership and Well-Bracketedness

The ability to reason about ownership is also fertile ground for designing reasoning principles that capture powerful semantic properties of programs. In particular, Guéneau et al. 18 show that it is possible to capture well-bracketedness in a Hoare-style program logic based on separation logic, providing proof rules to prove the correctness of well-bracketed programs both directly and through unary and binary logical-relations models defined on top of this program logic.
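A classic example of a property that holds only under well-bracketed control flow (in particular, in the absence of continuations such as call/cc), and that such a program logic can establish:

```ocaml
(* The unknown function [f] may call [awkward] reentrantly, yet under
   well-bracketed control flow every execution of the body completes its
   write [x := 1] before control returns, so the assertion always holds;
   with continuations (call/cc) it could be violated. *)
let awkward : (unit -> unit) -> int =
  let x = ref 0 in
  fun f ->
    x := 0; f ();
    x := 1; f ();
    assert (!x = 1);
    !x
```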

Multi-language Verification

Guéneau continued work on the Melocoton multi-language program verification framework 68, together with Master's interns Gurvan Debaussart and Valeran Maytie. They extended Melocoton and its Coq mechanization with new semantics and reasoning rules for the interactions between OCaml exceptions and the OCaml FFI, and for the so-called “GC roots” provided by the OCaml FFI.

Inference of Specifications for Closures

In many programs, closures make it possible to express data transformations concisely. But when a verification tool is used, they must be accompanied by specifications that are often longer than their bodies. This is a particularly unpleasant problem, since these closures are often simple and their specification is redundant. Patault, Golfouse and Denis 16 presented a mechanism for inferring the specifications of closures for the formal verification of Rust programs. They propose the use of the intermediate verification language Coma as a backend for the deductive verification tool Creusot. Their design is able to manage the internal mutable state of a closure and to infer its specification. They use this mechanism to verify, in an ergonomic and modular way, a series of Rust programs using higher-order functions.
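For instance, even a tiny closure with internal mutable state, such as the following OCaml counter, needs a specification that tracks its hidden state across calls; the inference mechanism aims at producing such specifications automatically in the common simple cases:

```ocaml
(* A closure capturing a mutable counter: its specification must relate
   the captured state [n] before and after each call. *)
let make_counter () : unit -> int =
  let n = ref 0 in
  fun () -> incr n; !n

let c = make_counter ()
let one = c ()   (* returns 1 *)
let two = c ()   (* returns 2 *)
```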

Safe Libraries for Computer Algebra

Low-level libraries used in computer algebra systems, such as GMP, BLAS/LAPACK, etc., are usually written in C, Fortran, and Assembly, and make heavy use of arrays and pointers. Melquiond and Moreau 15, 21 have designed Capla, a programming language dedicated to writing such libraries. This language, halfway between C and Rust, is designed to be safe and to ease the deductive verification of programs, while being low-level enough to be suitable for this kind of computationally intensive application. They have also written a compiler for this language, based on CompCert. The safety of the language has been formally proved using the Coq proof assistant, and so has the property of semantics preservation for the compiler.

Verification of Computer Arithmetic

Reasoning about Floating-Point Programs

Numerical programs use the floating-point representation of numbers to perform computations that ideally would be done on mathematical real numbers. The floating-point representation induces approximations in the computations ultimately performed. We have a long tradition of studying subtle algorithms involving such numerical computations. A set of numerical programs that we studied this year relates to the combination of exponential and logarithm functions, namely the log-sum-exp function known in the context of machine learning 73, for which Bonnot et al. 12 provide certified bounds on its accuracy.
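For reference, the log-sum-exp function, together with the classical max-shift rewriting commonly used to avoid overflow when evaluating it, is:

\[
\mathrm{LSE}(x_1,\dots,x_n) \;=\; \log \sum_{i=1}^{n} e^{x_i} \;=\; m + \log \sum_{i=1}^{n} e^{x_i - m}, \qquad m = \max_{i} x_i .
\]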

Boyer, Faissole, and Melquiond 35 have patented a method and system to transform a program, which includes mathematical functions applied to floating-point variables (and thus suffers from numerical approximations and rounding errors), in order to achieve a specified global accuracy.

Bonnot, Boyer, Faissole, and Marché 31 propose to bound the error between the computed result and the ideal computation by a formula involving the assumed precision of the inputs of the program, together with the accuracy properties of the auxiliary mathematical functions that are called as basic blocks. Obtaining such an accuracy property requires high expertise in floating-point computer arithmetic. The proposed methodology automatically generates such accuracy formulas from a given input program. Moreover, the generated formulas are certified correct with a high level of confidence, thanks to the automated construction of formal proofs of their validity. The methodology is implemented and evaluated on several examples involving approximations of elementary functions such as sine, cosine, exponential, and logarithm.
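To give the flavor of such accuracy formulas (this is the classical rounding model, not necessarily the exact shape generated by the tool): each binary64 operation in round-to-nearest satisfies, barring overflow and underflow,

\[
\mathrm{fl}(x \circ y) \;=\; (x \circ y)\,(1+\delta), \qquad |\delta| \le 2^{-53}, \qquad \circ \in \{+,-,\times,/\},
\]

and error terms of this shape are propagated through the program and combined with the assumed accuracy of each auxiliary function.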

Writing a formal proof offers the highest possible confidence in the correctness of a mathematical library. This comes at a large cost, though, since formal proofs require taking into account all the details, even the seemingly insignificant ones, which makes them tedious to write. Faissole, Geneau and Melquiond 13, 19 propose a methodology and some dedicated automation, and apply them to the use case of a faithful binary64 approximation of the exponential. The peculiarity of this use case is that the target of the formal verification is not a simple model of some external code: it is an actual floating-point function defined in the logic of the Coq proof assistant, which is thus usable inside proofs once its correctness has been fully verified. This function presents all the attributes of a state-of-the-art implementation: bit-level manipulations, large tables of constants, obscure floating-point transformations, exceptional values, etc. It has been integrated into the proof strategies of the CoqInterval library, bringing a 20x speedup with respect to the previous implementation.

Formalization of Mathematics for Numerical Analysis

The correctness of programs solving partial differential equations may rely on mathematics not yet formalized, such as Sobolev spaces. Boldo et al. 46 therefore formalized in Coq the mathematical concept of Lebesgue integration and the associated results (σ-algebras, measures, simple functions, and integration of non-negative measurable functions, up to the full formal proofs of the Beppo Levi theorem and Fatou's lemma). Boldo et al. 50, 49 extended this formalization with Tonelli's theorem, stating that the (double) integral of a nonnegative measurable function of two variables can be computed by iterated integrals, allowing the order of integration to be switched.
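In standard notation, Tonelli's theorem states that for a nonnegative measurable function $f$ on a product of σ-finite measure spaces $(X,\mu)$ and $(Y,\nu)$,

\[
\int_{X\times Y} f \;\mathrm{d}(\mu\otimes\nu)
\;=\; \int_X \Big( \int_Y f(x,y)\, \mathrm{d}\nu(y) \Big) \mathrm{d}\mu(x)
\;=\; \int_Y \Big( \int_X f(x,y)\, \mathrm{d}\mu(x) \Big) \mathrm{d}\nu(y).
\]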

In her PhD thesis 28 defended in 2024, Houda Mouhcine presented results about Lagrange polynomials on simplicial finite elements, and their fundamental property called unisolvence. All the developments are distributed as part of the Coq-Num-Analysis library.

Spreading Formal Proofs

Compiling OCaml to WebAssembly

As part of his CIFRE PhD with OCamlPro, Andrès 38 formalizes a compilation scheme from OCaml to WebAssembly. This ongoing work has already validated several Wasm extensions 37. A by-product of the thesis is the implementation of a new, efficient interpreter for Wasm, called Owi. Filliâtre and Andrès collaborate with José Fragoso Santos and Filipe Marques (Universidade de Lisboa, Portugal), who are using Owi for concolic execution of WebAssembly programs 11. Owi is built around a modular, monadic interpreter capable of both normal and symbolic execution of Wasm programs. Monads have been identified as a way to write modular interpreters since 1995, and this strategy has allowed us to build a robust and performant symbolic execution tool, which our evaluation shows to be the best currently available for Wasm. Moreover, because WebAssembly is a compilation target for multiple languages (such as Rust and C), Owi can be used to find bugs in C and Rust code, as well as in codebases mixing the two. They demonstrate this flexibility through illustrative examples and evaluate its scalability via comprehensive experiments using the 2024 Test-Comp benchmarks. Results show that Owi achieves performance comparable to state-of-the-art tools like KLEE and Symbiotic, and exhibits advantages in specific scenarios where KLEE's approximations could lead to false negatives.
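A minimal sketch of the idea of a monadic interpreter (hypothetical module names, not Owi's actual API): the evaluation code is written once against an abstract monad, and instantiating the monad yields either a concrete or a symbolic interpreter.

```ocaml
(* The interface the interpreter is written against. *)
module type CHOICE = sig
  type 'a t
  val return : 'a -> 'a t
  val ( let* ) : 'a t -> ('a -> 'b t) -> 'b t
  (* Concrete execution just tests the condition; symbolic execution
     forks and (in a real tool) records a path constraint. *)
  val branch : bool -> bool t
end

module Concrete : CHOICE with type 'a t = 'a = struct
  type 'a t = 'a
  let return x = x
  let ( let* ) x f = f x
  let branch b = b
end

module Symbolic : CHOICE with type 'a t = 'a list = struct
  type 'a t = 'a list                  (* all explored outcomes *)
  let return x = [ x ]
  let ( let* ) xs f = List.concat_map f xs
  let branch _ = [ true; false ]       (* explore both paths *)
end

(* The same interpreter code serves both modes. *)
module Interp (C : CHOICE) = struct
  open C
  let abs (x : int) : int t =
    let* neg = branch (x < 0) in
    return (if neg then -x else x)
end
```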

Programmable Logic Controllers

Programmable Logic Controllers are industrial digital computers used as automation controllers in manufacturing processes. The Ladder language is a programming language used to develop software for such controllers. A long-term collaboration with MERCE has as its goal the verification that a given Ladder program conforms to an expected behavior expressed by a timing chart describing a scenario of execution. This method relies on a modelling of Ladder programs in WhyML, the language of the Why3 environment for deductive program verification. In this approach, the WhyML modelling of individual Ladder instructions has to be trusted. Recently, Cousineau et al. 32 proposed a methodology to increase the trust in the WhyML modelling of Ladder instructions. The approach relies on comparing the execution of Ladder programs with the execution, by Why3, of a simulation of the translated program. With this technique, they have been able to validate their modelling of Ladder instructions, and also to discover and fix a subtle bug in the modelling of one particular instruction.

Purely Functional Catenable Deques, Formally Verified

Twenty-five years ago, Kaplan and Tarjan 69 established a striking result: there exist “purely functional, real-time deques with catenation”. In this ongoing work, Guéneau, in collaboration with Jules Viennot, Arthur Wendling and François Pottier, presents the first implementation of Kaplan and Tarjan's catenable deques. This implementation is expressed in the purely functional subset of the OCaml language. Furthermore, they present the first verified implementation of Kaplan and Tarjan's catenable deques. This implementation is expressed in Gallina, the programming language of the Coq proof assistant, and its correctness is stated and verified within Coq. Source code and a journal paper for this contribution are being finalized and will be submitted for publication at the beginning of 2025.
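A plausible OCaml signature for the resulting structure (hypothetical names, not necessarily those of the final artifact):

```ocaml
(* Catenable deques: in Kaplan and Tarjan's structure, every operation
   below, including concatenation, runs in worst-case constant time. *)
module type CADEQUE = sig
  type 'a t
  val empty : 'a t
  val push : 'a -> 'a t -> 'a t            (* insert at the front *)
  val inject : 'a t -> 'a -> 'a t          (* insert at the back *)
  val pop : 'a t -> ('a * 'a t) option     (* extract at the front *)
  val eject : 'a t -> ('a t * 'a) option   (* extract at the back *)
  val append : 'a t -> 'a t -> 'a t        (* catenation *)
end
```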

Coq as a Computer Algebra System

User interfaces for the Coq proof assistant focus on the ability to write and verify proofs, and mostly ignore Coq's ability to compute. Melquiond 26 shows how the tactic-in-term feature and some ad hoc vernacular commands can provide a user experience closer to the one found in computer algebra systems. This work has been implemented in the CoqInterval library.

Teaching Mathematics with Coq

As part of the LiberAbaci challenge, we have worked on how to use Coq for teaching mathematics.

M. Mayero et al. have been teaching an 18-hour introductory course on formal proofs to first-year (L1) students for 3 years at Sorbonne Paris Nord University. We present the methodology used and some of the pitfalls encountered, covering both technical and pedagogical aspects 25.

We also worked on Coq worksheets for students to learn about divisibility and binomials. These basic topics are a good case study, as they are widely taught in the early academic years (or even earlier in France). Boldo et al. 23, 30 present the technical and pedagogical choices, together with the numerous exercises they developed. As expected, this requires additional Coq material, such as extra lemmas and dedicated tactics. The worksheets are freely available and flexible in several ways.