Seminar (2022-2023)

The following talks were given at the Épicure seminar during the academic year 2022-2023:

2023/07/17 Matthieu Lemerre “SSA Translation Is an Abstract Interpretation, and its Application to Machine Code Analysis” abstract
Conversion to to Static Single Assignment (SSA) form is usually viewed as a syntactic transformation algorithm that gives unique names to program variables, and reconciles these names using “phi” functions based on a notion of domination. We instead propose a semantic approach, where SSA translation is performed using a simple dataflow analysis. Based on a new technique to use cyclic terms in abstract domains, we propose a Symbolic Expression abstract domain that performs a Global Value Numbering analysis, upon which we build our SSA translation. This implies a shift in perspective, as global value numbering becomes a prerequisite of SSA translation, instead of depending on SSA. One application to performing SSA Translation by Abstract Interpretation is that SSA optimisations passes can be implemented as a combination of abstract domains, allowing to perform several optimizations simultaneously to solve the usual phase ordering problem and avoiding tedious maintainance of SSA invariants. Our main motivation for this research is an analyzer for machine code which uses SSA as its main intermediate representation. Machine code is too low-level to allow SSA translation without a prior semantic analysis, while SSA is an intermediate representation that makes static analysis easier than direct analysis of machine code. Viewing SSA translation as a semantic analysis solves this chicken-and-egg problem, allowing to simultaneously decompile machine code to SSA and use the SSA representation to perform the other semantic analyses (value analysis, memory analysis, and control-flow analysis). We illustrate the use of such an analysis, combined with a novel type system for machine code where type checking is also done by abstract interpretation, on an embedded OS kernel where we prove security properties directly from its executable.

2023/06/28 Damien Hardy “Ofast3D et un peu plus” abstract
La fabrication additive par dépôt de filament fondu, plus communément appelée impression 3D, a de nombreux champs d’applications. Néanmoins, la production d’une pièce imprimée en 3D peut s’avérer très lente. L’objectif de l’action exploratoire Ofast3D est d’augmenter la capacité de production des utilisateurs d’imprimante 3D par dépôt de filament fondu. Cette action vise à réduire le temps d’impression sans impacter la qualité d’impression en optimisant le code interprété par les imprimantes tout en prenant en compte la géométrie du modèle 3D. Cette présentation portera sur le processus d’impression 3D, le positionnement de Ofast3D dans la chaîne d’outils et présentera quelques problèmes de recherche à discuter.
2023/06/23 Jean-Loup Hatchikian-Houdot “Thwarting Timing Attacks in Microcontrollers using Fine-grained Hardware Protections” abstract
Timing side-channels are an identified threat for security critical software. Existing countermeasures have a cost either on the hardware requirements or execution time. We focus on low-cost microcontrollers that have a very low computational capacity. Although these processors do not feature out-of-order execution or speculation, they remain vulnerable to timing attacks exploiting the varying latencies of ALU operations or memory accesses. We propose to augment the RISC-V ISA with security primitives that have a guaranteed timing behavior. These primitives allow constant time ALU operations and memory accesses that do not alter the state of the cache. Our ap- proach has a low overhead in terms of hardware cost, binary code size, and execution time both for the constant time secure program and other programs running concurrently on the same hardware.
2023/06/22 Théo Losekoot “Automata-based verification of relational properties of functions over algebraic data structures” abstract
Cet exposé est une répétition pour la présentation du papier “Automata-based verification of relational properties of functions over data structures” à FSCD2023. Cette présentation durera 20 ou 25 minutes, en fonction des modalités de la conférence. Il sera sujet d’apprentissage automatique d’automates par généralisation à partir d’exemples et de la vérification des propriétés dans ces automates. On y parlera d’automates d’arbres et de convolution.

2023/06/01 Olivier Sentieys “Opportunities for computer architecture research with open-source hardware – The case of RISC-V” abstract
In this talk, I will motivate why computer architecture research is gaining momentum, and why the RISC-V open Instruction Set Architecture together with open-source hardware are two key enablers for cool research in this domain.
2023/05/25 Vincent Rébiscoul “Construire des Interpréteurs Abstraits à partir de Sémantiques Squelettiques” abstract
Un interpréteur abstrait est un outil permettant d’approximer le comportement de programmes, et de vérifier que ceux-ci respectent certaines propriétés (comme l’absence de certains bugs). Un interpréteur abstrait est un programme souvent complexe et donc difficile à développer. Nous proposons une méthode pour construire des interpréteurs abstraits corrects à partir de sémantiques mécanisées. L’objectif de l’exposé et de présenter cette méthode: en partant d’une description formelle d’un langage (ici une sémantique squelettique), on définit un interpréteur abstrait correct et exécutable.
2023/05/11 Alexandre Moine “A High-Level Separation Logic for Heap Space under Garbage Collection” abstract
A garbage collector greatly simplifies the life of a programmer: it automates the tedious task of deallocating memory. However, the heap space usage of garbage collected programs can be tricky to understand and reason about, as there is no explicit program point where space is reclaimed.We present a Separation Logic with Space credits to reason about heap space for a high-level language equipped with garbage collection. In this work, a central problem is to identify the memory locations that the garbage collector considers as roots. For this purpose, we propose a new assertion that keeps track of the existence of stack-to-heap pointers in a reasonably lightweight manner. In addition, we explain how to reason about function closures, which involve heap-allocated blocks that may hold pointers onto other blocks. Finally, I will present how we scale up to a concurrent language. In particular, I will focus on lock-free data structures and how we unveil their surprisingly high heap memory bounds in the presence of garbage collection.

2023/05/04 Christine Rizkallah (University of Melbourne) “A Language Approach for Facilitating Software Verification” abstract
Software verification is essential for building secure systems. Successful projects such as the verified seL4 microkernel and the verified CompCert C compiler have pushed the boundary of what is deemed feasible in our field. However, many critical software components still remain unverified. Verifying such components using the existing brute-force techniques would be infeasible. Instead, my research vision involves creating methods to reduce the cost of verification without compromising on efficiency or trust.In this talk, I will present some previous work that achieves this vision within several different areas in computer science. For instance, I will demonstrate how we used functional languages, uniqueness type systems, and certifying compiler techniques through the Cogent language to dramatically reduce the cost of verifying efficient filesystems. Furthermore, I will summarise how we used a data layout specification language called Dargent to ease verifying device drivers.
2023/04/06 Enzo Crance “Towards stratified parametricity for CCω” abstract
Proof automation in Coq notoriously suffers from the expressivity of its dependently-typed programming language, especially the possibility to represent a same mathematical object with several data structures. In this talk, we will explore state-of-the-art techniques exploiting relations between types to derive information about more complex terms, and we will present 2 pre-processing tools exploiting this kind of knowledge to rewrite goals. The first one, Trakt, extends the input domains of SMT-like automation tactics thanks to embeddings between types. The second one, more general but still in progress, is based on a new flavour of parametricity called stratified parametricity, building on recent work inside the Gallinette team.
2023/03/30 Frédéric Bour “Engineering error messages for LR parsers” abstract
LR parsing is a powerful technology that generates efficient parsers from a declarative specification and guarantees the absence of ambiguities. However, the parsers produced fall short when provided with incorrect input: generally, only the location of the erroneous construct can be reported. LRgrep aims to enhance LR(1) parsers with good error messages. It takes the form of a DSL that complements the grammar with a specification of error messages. Tools can check that all errors are covered and help to keep the error messages relevant when the grammar evolves. We demonstrate it on the OCaml grammar.

2023/03/16 Yannick Zakowski “Monadic Definitional Interpreters as Formal Semantic Models of Computations in Coq” abstract
Monadic interpreters have seen in recent years a rise in popularity as a means to model and reason about impure computations in dependently typed theories. The Interaction Tree project in particular implements a set of libraries in the Coq proof assistant providing tools to enable this approach at scale.This talk intents to paint an overview of the project. I will first introduce the core concepts: modelling divergence through coinduction, achieving modularity through layered interpreters in the style of algebraic effects, enabling testing by extraction, and offering equational and Hoare-style relational reasoning up-to a notion of weak bisimilarity. Projects to which the approach have been applied, notably in the modelling of LLVM IR as part of the Vellvm project, will be mentioned on the way. Through the second half of the presentation, I will focus more specifically on so-called Choice Trees, a recent proposal to provide in this ecosystem a proper support for non-deterministic computations.

2023/03/09 Basile Clément “Prophetic Annotations: A Different Take on Translation-Validation” abstract
Tensor compilers transform high-level mathematical specifications on (mostly) pure multidimensional arrays, called tensors, down to low-level imperative code with mutable arrays and loops, tailored to specific hardware. These compilers are mostly out of reach of traditional translation-validation techniques due to the structure-modifying transformations they perform.In this talk, I will present the results of my PhD research on a translation-validation technique for tensor compilers that is centered around the idea of _prophetic annotations_ : lightweight compiler-generated annotations inspired by the polyhedral model that do not encode transformations but rather link imperative array accesses to the original pure value the compiler believes they represent in the original specification.After a quick presentation of tensor compilers and the motivations for their use, I will present the techniques developed in my thesis for the validation of a subset of the transformations performed by these compilers that include both affine transformations and local algebraic transformations. I will show experimental results of the approach on the industrial-grade Halide compiler, and briefly present unfinished extensions that would allow the approach to handle a larger class of programs and transformations, including most common uses of non-affine programs and transformations as well as handling of commutative-associative reductions. Finally, I will present some thoughts on the way this work separates affine from non-affine annotations, and the potential uses of this approach to provide lightweight specifications for the verification of numerical programs.
2023/03/02 Simon Castellan, Action Exploratoire Inria “Back to the trees”
2023/02/06 Son Ho, “Aeneas: Rust Verification by Functional Translation” abstract
We present Aeneas, a new verification toolchain for Rust programs based on a lightweight functional translation. We leverage Rust’s rich region-based type system to eliminate memory reasoning for a large class of Rust programs by translating them to a pure lambda-calculus, as long as they do not rely on interior mutability or unsafe code. Doing so, we relieve the proof engineer of the burden of memory-based reasoning, allowing them to instead focus on functional properties of their code.

2023/01/26 Vincent Rébiscoul, “Analyses Statiques pour Sémantiques Squelettiques” abstract
Dans cette présentation, nous décrivons une méthode pour définir une interprétation abstraite correcte à partir d’une description formelle de sémantique d’un langage de programmation: une Sémantique Squelettique. Nous ajoutons une notion de points de programme aux sémantiques squelettiques pour différencier des morceaux de programme syntaxiquement équivalents mais qui se trouve à des endroits différents. Nous proposons une méthodologie pour dériver une interprétation abstraite à partir d’une sémantique squelettique qui est correcte par construction. Nous appliquons notre méthode pour définir une analyse d’intervalle pour un petit language impératif.
2023/01/12 Pierre Lermusiaux, “Analyse statique de transformations pour l’élimination de motifs” abstract
Thèse effectuée au Loria sous la direction de Pierre-Étienne Moreau et Horatiu Cirstea.Ma thèse propose une étude formelle des procédures de transformation de programmes dans le but d’exprimer et de garantir des propriétés syntaxiques sur le comportement et les résultats d’une telle transformation. Dans le contexte de la vérification formelle des programmes, il est en effet souvent nécessaire de pouvoir caractériser la forme des termes obtenus par réduction suivant une transformation considérée. Dans le contexte de langages typés, le système de type peut permettre de spécifier et de vérifier certaines caractéristiques sur la forme de ces termes, en explicitant notament leur symboles constructeurs. Il est cependant souvent intéressant de pouvoir spécifier l’absence de certains symboles, ou de certains motifs.J’introduit, dans ma thèse, un formalisme, basé sur les notions de filtrage par motif et de réécriture, permettant de décrire de telles propriétés. Le formalisme proposé repose sur un système d’annotations des symboles de fonction décrivant une spécification du comportement attendu des fonctions associées. On présente alors une méthode d’analyse statique permettant de vérifier que les transformations étudiées, exprimées par un système de réécriture, satisfont en effet ces spécifications.
2022/12/14 Frédéric Besson, “An eBPF backend for CompCert” 2022/11/28 Santiago Bautista, “Lifting Numeric Relational Domains to Algebraic Data Types” 2022/11/09 Frédéric Besson, “An extension of CompCert that allows to call binary code” 2022/10/26 Léo Gourdin, “Formally verified block optimizations by symbolic execution modulo invariants” abstract
CompCert (ACM Software System Award 2021) is the first industrial-strength compiler with a mechanically checked proof of correctness. But CompCert remains a moderately optimizing C compiler. Indeed, some optimizations of “gcc -O1” like Lazy Code Motion (LCM) or Strength Reduction (SR) are still missing: developing these efficient optimizations together with their formal proofs remains challenging. We already combined formally verified translation validators for certifying the results of an efficient superblock scheduler with peephole optimizations. We now revisit and generalize this approach into a framework (integrated in CompCert) able to validate many more optimizations: an enhanced superblock scheduler, but also Dead Code Elimination (DCE), Constant Propagation (CP), and more noticeably, LCM and SR global transformations. Toward this goal, we introduce a new intermediate representation (IR) named BTL, which syntactically represents blocks of code. Then, we propose a block-based symbolic execution framework capable of aggregating “local” simulations at the global level, making sure that the changes are consistent between blocks.

2022/09/28 Thomas Genet, “Comprendre Ethereum/Solidity par la pratique”
2022/09/21 Roméo La Spina (Inria Épicure), “Un analyseur flots de données pour le compilateur formellement vérifié CompCert” abstract
Un compilateur vérifié est un compilateur accompagné d’une preuve qui montre que ce dernier n’introduit pas de bogue dans les programmes. Une référence du domaine est le compilateur CompCert du langage C, entièrement écrit et prouvé à l’aide de l’assistant de preuve Coq. CompCert est un compilateur non seulement vérifié, mais aussi optimisant ; ses performances en termes de temps d’exécution d’un programme compilé sont comparables à celles obtenues avec l’option `-O1` du compilateur de référence GCC (non vérifié).Plusieurs optimisations de CompCert utilisent une analyse flot de données, qui résout un ensemble d’équations grâce à un algorithme dû à Kildall. Cependant, selon la taille du treillis utilisé pour faire l’analyse, la convergence de l’algorithme peut nécessiter un temps extrêmement long. Un autre algorithme, développé par Bourdoncle, qui paramètre l’analyse par un ordre d’itération calculé au préalable, permet l’utilisation efficace d’opérateurs d’élargissement, solution classique pour pallier ce problème dans le domaine de l’interprétation abstraite. Nous avons donc choisi d’implémenter l’algorithme de Bourdoncle dans CompCert et de comparer ses performances à celles de l’algorithme de Kildall, à la fois en termes d’efficacité et en termes de précision de l’analyse.