Seminar

Seminars related to Celtique’s research themes:

  • Celtique Seminar – internal only
  • 68NQRT – http://68nqrt.irisa.fr/
  • Sosysec – http://seminaire-dga.gforge.inria.fr/

Past sessions of the Celtique Seminar:

  • 2022/06/09: Zaynah Dargaye (Nomadic Labs), “Towards the Conciliation of Formal Methods and the Tezos Development Cycle” abstract

    Tezos aims to be both: i) a cutting-edge technology, shipping fancy, state-of-the-art features at a high frequency, and ii) a high-quality technology, managing critical data and deployed in an adversarial context. These two contradictory needs require a development process that includes a very specific and unusual form of formal methods. Implementing such a process requires incremental inclusion at each stage of the development cycle. In this talk, I will try to illustrate, via two development examples, how we try to achieve such a process at Nomadic Labs.
  • 2022/05/25: Shenghao Yuan (TEA, Inria), “End-to-end verification of the rBPF runtime” abstract

    RIOT is a micro-kernel dedicated to IoT applications that adopts eBPF (extended Berkeley Packet Filters) to implement so-called femto-containers: as micro-controllers rarely feature hardware memory protection, the isolation of eBPF virtual machines (VM) is critical to ensure system integrity against potentially malicious programs. We propose a methodology to directly derive the verified C implementation of an eBPF virtual machine from a Gallina specification within the Coq proof assistant. Leveraging the formal semantics of the CompCert C compiler, we obtain an end-to-end theorem stating that the C code of our VM inherits the safety and security properties of its Gallina specification. Our refinement methodology ensures that the isolation property of the specification holds in the verified C implementation. Preliminary experiments show satisfactory performance.
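    The isolation argument can be illustrated by a minimal OCaml sketch (the instruction set and memory model below are invented for this summary and are far simpler than the actual rBPF virtual machine): a defensive interpreter checks every memory access against the regions granted to the VM, so a malicious program can at worst be rejected, never escape its sandbox.

      (* Defensive VM step: every load/store is checked against the memory
         regions granted to the virtual machine.  Illustration only. *)
      type region = { base : int; size : int }

      type instr =
        | Load  of int * int        (* Load (dst_reg, addr)  *)
        | Store of int * int        (* Store (addr, src_reg) *)
        | Add   of int * int * int  (* Add (dst, src1, src2) *)

      type state = { regs : int array; mem : (int, int) Hashtbl.t; grants : region list }

      let in_grants grants addr =
        List.exists (fun r -> addr >= r.base && addr < r.base + r.size) grants

      (* [step] either performs one instruction or stops with an error:
         it never touches memory outside the granted regions. *)
      let step (st : state) (i : instr) : (state, string) result =
        match i with
        | Add (d, s1, s2) ->
            st.regs.(d) <- st.regs.(s1) + st.regs.(s2); Ok st
        | Load (d, addr) when in_grants st.grants addr ->
            st.regs.(d) <- (try Hashtbl.find st.mem addr with Not_found -> 0); Ok st
        | Store (addr, s) when in_grants st.grants addr ->
            Hashtbl.replace st.mem addr st.regs.(s); Ok st
        | Load _ | Store _ -> Error "memory access outside the granted regions"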
  • 2022/05/18: Vincent Laporte (Pesto, Inria) “Verified Security Against Spectre” abstract

    Speculative execution performed by many microprocessors may induce security vulnerabilities (information leakage). Some of those vulnerabilities will not be addressed at the hardware level and require programmers (or compilers) to deploy counter-measures in software. Speculative load hardening (SLH) is such a counter-measure, targeting the Spectre-PHT vulnerability. This talk will show how to automatically verify, at assembly level, that SLH is correctly implemented, and how to formally prove (in Coq) the security provided by SLH.
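    The effect of SLH can be conveyed by a toy OCaml model (invented for this summary; the verified development works at assembly level, and real SLH implementations may mask addresses rather than loaded values): an attacker-controlled prediction drives execution down the wrong branch, and the hardened load masks its result with the misspeculation predicate, so the secret never reaches an attacker-observable state.

      (* Toy model of Spectre-PHT and speculative load hardening (SLH).
         Indices [0, public_len) of [mem] hold public data, the rest is secret. *)
      let mem = [| 1; 2; 3; 99 (* secret *) |]
      let public_len = 3

      (* Speculative run of: "if i < public_len then observe mem.(i)".
         [mispredict] is the attacker-chosen branch prediction; the result
         models what leaks to the attacker through side channels. *)
      let run ~hardened ~mispredict i : int option =
        let in_bounds = 0 <= i && i < public_len in
        let taken = if mispredict then not in_bounds else in_bounds in
        if not taken then None
        else begin
          let misspec = not in_bounds in   (* misspeculating iff the real test failed *)
          let loaded = mem.(i) in          (* the load happens speculatively anyway *)
          (* SLH: mask the loaded value with the misspeculation predicate. *)
          Some (if hardened && misspec then 0 else loaded)
        end

      let () =
        assert (run ~hardened:false ~mispredict:true 3 = Some 99);  (* Spectre leak  *)
        assert (run ~hardened:true  ~mispredict:true 3 = Some 0)    (* leak blocked  *)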
  • 2022/05/11: Byron Hawkins (PACAP, Inria), “The Introspicion Project” abstract

    The Introspicion project, carried out in the PACAP team at Inria Rennes, aims to detect software execution anomalies that may indicate a threat. In a first phase of static analysis of the binary (without symbols), a profile of normal behaviour is built in the form of a control-flow grammar, enriched with multi-variable formulas that represent the possible data values. In the second, monitoring phase, the software in service is protected by an anomaly-detection system equipped with the normal-behaviour profile, so that anomalies can be identified. Variables are resolved progressively as the execution unfolds, in order to minimize evaluation time while preserving the precision of a full path analysis.
    The normal-behaviour profile is based on identifying the dependencies and distinctions between machine-code elements: registers, memory addresses and memory values. To avoid never-ending computations, the formulas are organized in a partitioned way, with one variable for each case where the dependency or distinction between two elements cannot be decided in a static context. The dynamic evaluation is optimized just-in-time by attaching to each element of the grammar the clauses of the relevant formulas. Under dynamic monitoring, a clause is resolved when the execution path goes through a grammar element to which that clause is attached.
    Both analyses, the static one that builds the profile and the dynamic one that evaluates the execution, rely on an incremental depth-first traversal algorithm that detects cycles and other interesting relations in the semantics of the machine code. Such algorithms are usually specialized either for dense graphs or for sparse graphs. Since this domain is likely to produce graphs whose subtrees differ widely in density, the project develops a new algorithm that hybridizes the classical traversal (“DFS clock”) with a relative traversal over classified edges. To support fast evaluation of the variables, this algorithm allows a vertex to be added in constant time (O(1)), whatever the size of the graph.
  • 2022/05/04: Arthur Azevedo de Amorim (Boston University), “On Incorrectness Logic and Kleene Algebra with Top and Tests” abstract

    Kleene algebra with tests (KAT) is a foundational equational framework for reasoning about programs, which has found applications in program transformations, networking and compiler optimizations, among many other areas. In his seminal work, Kozen proved that KAT subsumes propositional Hoare logic, showing that one can reason about the (partial) correctness of while programs by means of the equational theory of KAT. In this work, we investigate the support that KAT provides for reasoning about incorrectness, instead, as embodied by O’Hearn’s recently proposed incorrectness logic. We show that KAT cannot directly express incorrectness logic. The main reason for this limitation can be traced to the fact that KAT cannot express explicitly the notion of codomain, which is essential to express incorrectness triples. To address this issue, we study Kleene Algebra with Top and Tests (TopKAT), an extension of KAT with a top element. We show that TopKAT is powerful enough to express a codomain operation, to express incorrectness triples, and to prove all the rules of incorrectness logic sound. This shows that one can reason about the incorrectness of while-like programs by means of the equational theory of TopKAT.
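    For readers less familiar with these encodings, the contrast can be summarized as follows (a sketch in the usual relational model; the notation is adapted and may differ from the paper’s exact statements). In plain KAT, a Hoare triple states that no execution started in the precondition ends outside the postcondition, while the incorrectness triple needs the top element to talk about the codomain of a program:

      \{b\}\, p\, \{c\} \iff b \cdot p \cdot \neg c = 0
      [b]\, p\, [c] \iff \top \cdot c \le \top \cdot b \cdot p

    Read relationally, the product ⊤·b·p relates every state to every output of p on b, so the second inequation says that every state satisfying c is reachable by running p from some state satisfying b, which is the under-approximation reading of incorrectness logic.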
  • 2022/04/27: Jean-Loup Hatchikian-Houdot, “Software-Hardware Contract to Mitigate the Cost of Cryptographic Constant Time”
  • 2022/04/06: Denis Mérigoux (Prosecco, Inria) “Catala, a programming language for the law” abstract

    Law at large underpins modern society, codifying and governing many aspects of citizens’ daily lives. Oftentimes, law is subject to interpretation, debate and challenges throughout various courts and jurisdictions. But in some other areas, law leaves little room for interpretation, and essentially aims to rigorously describe a computation, a decision procedure or, simply said, an algorithm. Unfortunately, prose remains a woefully inadequate tool for the job. The lack of formalism leaves room for ambiguities; the structure of legal statutes, with many paragraphs and sub-sections spread across multiple pages, makes it hard to compute the intended outcome of the algorithm underlying a given text; and, as with any other piece of poorly-specified critical software, the use of informal language leaves corner cases unaddressed. We introduce Catala, a new programming language that we specifically designed to allow a straightforward and systematic translation of statutory law into an executable implementation. Catala aims to bring together lawyers and programmers through a shared medium, which together they can understand, edit and evolve, bridging a gap that often results in dramatically incorrect implementations of the law. We have implemented a compiler for Catala, and have proven the correctness of its core compilation steps using the F* proof assistant. We evaluate Catala on several legal texts that are algorithms in disguise, notably section 121 of the US federal income tax and the byzantine French family benefits; in doing so, we uncover a bug in the official implementation. We observe as a consequence of the formalization process that using Catala enables rich interactions between lawyers and programmers, leading to a greater understanding of the original legislative intent, while producing a correct-by-construction executable specification reusable by the greater software ecosystem.
  • 2022/03/30: Pierre-Évariste Dagand (CNRS, IRIF), “Dogfooding and Coq: program logics for monadic programs” abstract

    Monadic programming is an essential component in the toolbox of functional programmers. For pure and total programmers, who sometimes navigate the waters of certified programming in type theory, it is the only means to concisely implement the imperative traits of certain algorithms. Monads open up a portal to the imperative world, all from the comfort of the functional world. The trend towards certified programming within type theory raises the question of how to reason about such programs. Since effectful programs are encoded as pure programs in the host type theory, we can readily manipulate these objects through their encoding.
    In this talk, we pursue the idea, popularized by Kenji Maillard and his colleagues, that every monad deserves a dedicated program logic and that, consequently, a proof about a monadic program ought to take place within a Floyd-Hoare logic built for the occasion. In the process, we will try to bring the most potent Coq library for reasoning about imperative programs (Iris) to bear on our very own monadic programs in Coq (with applications to CompCert and binary parsing).
  • 2022/03/23: Simon Castellan, “Everything You Always Wanted to Know About Monads But Were Afraid to Ask (2/2)”
  • 2022/03/16: Simon Castellan, “Everything You Always Wanted to Know About Monads But Were Afraid to Ask (1/2)” abstract

    In this talk, I will present a bit of the theory of monads, those organs that allow effects to reproduce inside a program. Monads are an algebraic structure that combines the sequential composition of effectful programs, `e1; e2`, with value passing, `let x = M in N`. The point of monads is to make it possible to represent effectful programs inside pure functional languages, and therefore to reason equationally (and locally) about them.
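    As a concrete illustration (a standard textbook example, not taken from the talk), here is the option monad in OCaml: `bind` generalizes `let x = M in N` to effectful computations, and `e1; e2` is the special case where the bound value is ignored.

      (* The option monad: [return] injects a pure value, [bind] sequences an
         effectful computation with the rest of the program.  Illustration only. *)
      let return (x : 'a) : 'a option = Some x

      let bind (m : 'a option) (k : 'a -> 'b option) : 'b option =
        match m with
        | None -> None        (* the effect: abort the whole computation *)
        | Some x -> k x

      let ( let* ) = bind     (* OCaml binding-operator syntax for monads *)

      let safe_div x y = if y = 0 then None else Some (x / y)

      (* "let x = M in N" with a possible failure effect: *)
      let example =
        let* a = safe_div 10 2 in
        let* b = safe_div a 0 in   (* fails: the rest is skipped *)
        return (a + b)
      (* example = None *)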
  • 2022/03/09: Luca Olivieri (University of Verona), “Analysis of Smart Contracts written in General Purpose Languages” abstract

    Smart contracts are programs that allow peers to deploy and execute code on a blockchain network in an immutable, distributed, and decentralized way. Although some domain-specific languages (DSLs) have been proposed for writing smart contracts (such as Solidity and Vyper for the Ethereum blockchain), the choice of adopting general-purpose ones, such as JavaScript, C++, and Go, is widespread in industrial solutions such as HyperLedger Fabric, Cosmos SDK and Ignite. However, general-purpose languages support several features that might introduce further complexity and vulnerabilities in smart contracts. The seminar will give a gentle overview of the main issues these features raise and of the analysis techniques adopted by the static analyzer GoLiSA to detect them.
  • 2022/03/02: Léo Gourdin (Verimag/TIMA, Grenoble), “Formally Verified Intra-block Optimizations by Symbolic Execution” abstract

    An important challenge for safety-critical applications is to rule out the bugs introduced by compilation, which often lurk in the optimization passes of mainstream (unverified) compilers. Formally verified alternatives offer a much higher level of confidence in the generated code, but usually lack aggressive optimizations. In-order processors execute assembly instructions in their syntactic order and are common in critical systems, yet existing verified compilers do not perform compile-time instruction scheduling, even though it could significantly reduce program execution times. We present an efficient certified instruction scheduler that operates over superblocks (it can move instructions across branches), together with its performance evaluation, for the CompCert compiler.
    Our solution consists of several passes designed by a posteriori validation: program transformations are proposed by untrusted oracles and then validated by certified checkers. The result is a certified superblock scheduler that exploits most of the instruction-level parallelism while remaining efficient even for large superblocks. Moreover, our optimization is compatible with the existing passes at the register transfer level (RTL), is architecture-independent, and could be extended to certify more complex transformations.
    We discuss the overall impact of our work on execution time, as well as the flexible and reusable basis obtained thanks to the symbolic-execution approach.
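    The flavour of the validation approach can be seen in a deliberately simplified OCaml sketch (straight-line assignments over pseudo-registers, no branches and no memory, far from the actual Coq development): the untrusted oracle proposes a new instruction order, and the checker symbolically executes both blocks and accepts only if the resulting symbolic states coincide.

      (* Translation validation by symbolic execution.  Illustration only. *)
      type expr = Var of string | Op of string * expr list  (* symbolic values *)
      type instr = Assign of string * string * string list  (* r := op(args)   *)

      module Env = Map.Make (String)

      (* Symbolic execution: each register is mapped to a symbolic expression
         over the initial values of the registers. *)
      let sym_exec (block : instr list) : expr Env.t =
        let lookup env r = try Env.find r env with Not_found -> Var r in
        List.fold_left
          (fun env (Assign (r, op, args)) ->
             Env.add r (Op (op, List.map (lookup env) args)) env)
          Env.empty block

      (* The trusted part: accept the oracle's schedule only if it is
         symbolically equivalent to the original block. *)
      let validate ~original ~scheduled =
        Env.equal ( = ) (sym_exec original) (sym_exec scheduled)

      let () =
        let original  = [ Assign ("a", "add", ["x"; "y"]); Assign ("b", "mul", ["z"; "z"]) ] in
        let scheduled = [ Assign ("b", "mul", ["z"; "z"]); Assign ("a", "add", ["x"; "y"]) ] in
        assert (validate ~original ~scheduled)   (* independent instructions commute *)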
  • 2022/02/09: Alexandre Moine (Cambium, Inria), “Specification and Verification of a Transient Data Structure” abstract

    A transient data structure offers both an ephemeral interface and a persistent interface, together with efficient conversions between the two. Sek is an OCaml library (available on opam) that provides transient sequences. Their implementation relies on fixed-capacity arrays, called chunks. Sharing chunk segments between the different instances of a structure is the key ingredient for efficient operations. However, such sharing is a challenge for formalization. In this talk, I will present a proof, in separation logic with time credits, of the functional correctness and amortized complexity of a transient stack and its iterators, following the mechanisms used by the Sek library. This formalization addresses the sharing challenge through monotonicity arguments and provides specifications that allow a selective invalidation of concurrent iterators.
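    The key design can be conveyed by a very small OCaml model (invented for this summary; the actual Sek library and its proof are considerably richer): chunks are shared between the persistent and the ephemeral views, both conversions are constant-time, and the ephemeral view mutates a chunk in place only when it uniquely owns it, copying it first otherwise.

      (* A toy transient stack built on fixed-capacity chunks.  Illustration only. *)
      let capacity = 4

      type chunk = { data : int array; mutable len : int }
      type persistent = chunk list   (* never mutated *)
      type ephemeral = { mutable chunks : chunk list; mutable front_owned : bool }

      let edit (p : persistent) : ephemeral =       (* O(1): share every chunk *)
        { chunks = p; front_owned = false }

      let snapshot (e : ephemeral) : persistent =   (* O(1): freeze the front chunk *)
        e.front_owned <- false;
        e.chunks

      let push (e : ephemeral) (x : int) : unit =
        match e.chunks with
        | c :: _ when e.front_owned && c.len < capacity ->
            c.data.(c.len) <- x; c.len <- c.len + 1                  (* in place *)
        | c :: rest when c.len < capacity ->
            let c' = { data = Array.copy c.data; len = c.len } in    (* copy on write *)
            c'.data.(c'.len) <- x; c'.len <- c'.len + 1;
            e.chunks <- c' :: rest; e.front_owned <- true
        | _ ->
            let c' = { data = Array.make capacity x; len = 1 } in    (* fresh chunk *)
            e.chunks <- c' :: e.chunks; e.front_owned <- true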
  • 2022/02/02: Benoît Montagu, “Formal Verification of an Industrial μKernel: An Experience Report on the ProvenCore Project”
  • 2022/01/26: Benjamin Farinier (TU Wien), “WebSpec: Towards Machine-Checked Analysis of Browser Security Mechanisms” abstract

    The complexity of browsers has steadily increased over the years, driven by the continuous introduction and update of Web platform components, such as novel Web APIs and security mechanisms. Their specifications are manually reviewed by experts to identify potential security issues.
    However, this process has proved to be error-prone due to the extensiveness of modern browser specifications and the interplay between new and existing Web platform components. To tackle this problem, we developed WebSpec, the first formal security framework for the analysis of browser security mechanisms, which enables both the automatic discovery of logical flaws and the development of machine-checked security proofs. WebSpec, in particular, includes a comprehensive semantic model of the browser in the Coq proof assistant, a formalization in this model of ten Web security invariants, and a compiler turning the Coq model and the Web invariants into SMT-LIB formulas.
    We showcase the effectiveness of WebSpec by discovering two new logical flaws caused by the interaction of different browser mechanisms and by identifying three previously discovered logical flaws in the current Web platform, as well as five in old versions. Finally, we show how WebSpec can aid the verification of our proposed changes to amend the reported inconsistencies affecting the current Web platform.
  • 2022/01/19: Aurèle Barrière, “Verified Native Code Generation in a JIT Compiler” abstract

    This talk presents ongoing work on verifying, in Coq, a prototype just-in-time (JIT) compiler. Modern JIT compilers optimize some functions dynamically. The most efficient ones use static compilation techniques to generate native code for the optimized functions. Executing a program with a JIT compiler then amounts to interleaving optimizations, executions of native code and executions of non-compiled code. The talk will present a method for verifying a JIT prototype with native code generation. This method relies, on the one hand, on the proof principles used in the CompCert compiler and, on the other hand, on the use of monads to work around the limits of Coq extraction, which stem from the impure manipulation of the execution stack during native code generation.
  • 2022/01/12: Louis Noizet, “Handling Semantics with Necro”
  • 2021/12/03: Santiago Bautista, “Static Analyses for Semi-interactive Program Verification”
  • 2021/11/30: Théo Losekoot, “Automatic verification of functional programs with regular relations”
  • 2021/11/19: Raphaël Monat (LIP6/Sorbonne Université), “A Multilanguage Static Analysis of Python/C Programs with Mopsa” abstract

    Mopsa is a static analyzer platform, independent of language and abstraction choices. Developers are free to add arbitrary abstractions (numeric, pointer, memory, etc.) and syntax iterators for new languages. Mopsa encourages the development of independent abstractions which can cooperate or be combined to improve precision. In this talk, we will show how Mopsa analyses Python programs calling C libraries. It analyses directly and fully automatically both the Python and the C source codes. It reports runtime errors that may happen in Python, in C, and at the interface. We implemented our analysis in a modular fashion: it reuses off-the-shelf C and Python analyses written in the same analyzer. Our analyzer can tackle tests of real-world libraries consisting of a few thousand lines of C and Python.
  • 2021/11/17: Gautier Raimondi, “Type-Directed Program Transformation for Constant-Time Enforcement”
  • 2021/11/15: Neea Rusch (Augusta U., USA), “Implementing the mwp-flow Analysis” abstract

    Jones and Kristiansen’s mwp-flow analysis certifies polynomial bounds on the size of the values returned by an imperative program. This method is compositional, extensible and elegant, as it bounds transitions between states instead of focusing on states in isolation. While simple to use, this theoretical result is difficult to prove correct and implement. Here we detail the challenges a naive implementation has to face, and how we remedied them to offer a fast, efficient and extended implementation of the technique. The result, pymwp, is a lightweight tool to automatically perform data-size analysis of C programs. This effort prepares and enables the development of certified complexity analysis, by transforming a costly analysis into a tractable program that leverages compositionality and decouples the problem of establishing the existence of a bound from that of computing its value.
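    As a simple instance of the kind of guarantee involved (an illustrative bound, not an actual mwp derivation), consider the loop `for i = 1 to n do x := x + y done`: the analysis certifies that the final value of x is polynomially bounded in the sizes of the inputs,

      |x_{final}| \le |x_0| + n \cdot |y_0|

    whereas a loop iterating `x := x * x` grows exponentially and admits no such polynomial bound.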
  • 2021/11/09: Guillaume Ambal, “Certified Abstract Machines for Skeletal Semantics”
  • 2021/10/28: Arthur Charguéraud (Camus, Inria), “OptiTrust: An Interactive Framework for Source-to-Source Code Transformation”
  • 2021/10/26: Adam Khayam, “A practical approach for writing formalizations”
  • 2021/10/19: Louis Noizet, “Stating Semantics with Skel”
  • 2021/10/12: Thomas Genet, “Termination of Ethereum Contracts”
  • 2021/10/05: Simon Castellan, “From Concurrent Programs to Petri Nets”
  • 2021/10/05: Vincent Rébiscoul, “Deriving Static Analyses from Skeletal Semantics”
  • 2021/09/28: Frédéric Besson, “CPA beats ∞-CFA” FTfJP ’09
  • 2021/09/21: Thomas Jensen, “Épicure”
  • 2021/09/14: Benoît Montagu, “Control-Flow Analysis”
  • 2021/02/12: Vincent Rébiscoul, “Static Analyses for Skeletal Semantics”
  • 2021/02/05: Yann Herklotz (Imperial College, UK), “Formal Verification of High-Level Synthesis” abstract

    High-level synthesis (HLS), which refers to the automatic compilation of software into hardware, is rapidly gaining popularity. In a world increasingly reliant on application-specific hardware accelerators, HLS promises hardware designs of comparable performance and energy efficiency to those coded by hand in a hardware description language like Verilog, while maintaining the convenience and the rich ecosystem of software development. However, current HLS tools cannot always guarantee that the hardware designs they produce are equivalent to the software they were given, thus undermining any reasoning conducted at the software level. Worse, there is mounting evidence that existing HLS tools are quite unreliable, sometimes generating wrong hardware or crashing when given valid inputs. To address this problem, we present the first HLS tool that is mechanically verified to preserve the behaviour of its input software. Our tool, called Vericert, extends the CompCert verified C compiler with a new hardware-oriented intermediate language and a Verilog back end, and has been proven correct in Coq. Vericert supports all C constructs except for case statements, function pointers, recursive function calls, integers larger than 32 bits, floats, and global variables. An evaluation on the PolyBench/C benchmark suite indicates that Vericert generates hardware that is around an order of magnitude slower and larger than hardware generated by an existing, optimising (but unverified) HLS tool.
  • 2021/01/29: Emmanuel Fleury
  • 2020/12/18: Aurèle Barrière, “Formally Verified Speculation and Deoptimization in a JIT Compiler” abstract

    Just-in-time compilers for dynamic languages routinely generate code under assumptions that may be invalidated at run-time; this allows program code to be specialized to the common case, in order to avoid unnecessary overheads due to uncommon cases. This form of software speculation requires support for deoptimization when some of the assumptions fail to hold. We present a model just-in-time compiler with an intermediate representation that makes explicit the synchronization points used for deoptimization and the assumptions made by the compiler’s speculation. We also present several common compiler optimizations that can leverage speculation to generate improved code. The optimizations are proved correct with the help of a proof assistant. While our work stops short of proving native code generation, we demonstrate how one could use the verified optimizations to obtain significant speedups in an end-to-end setting.
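    The role of the speculation and deoptimization instructions can be sketched in a toy OCaml interpreter (the names and semantics below are simplified inventions, not the paper’s actual intermediate representation): an instruction checks an assumption and, if it fails at run time, transfers control back to the unoptimized version of the function with a reconstructed environment.

      (* Toy model of speculation and deoptimization in a JIT IR.  Illustration only. *)
      type value = Int of int
      type env = (string * value) list

      type instr =
        | Op of string * (env -> env)            (* an ordinary instruction *)
        | Assume of (env -> bool) * string * (env -> env)
            (* Assume (pred, baseline_target, varmap): if [pred] holds, execution
               continues in the optimized code; otherwise we deoptimize, rebuilding
               the environment with [varmap] and resuming at [baseline_target]. *)

      type outcome =
        | Finished of env                        (* optimized code ran to the end *)
        | Deoptimized of string * env            (* fell back to the baseline code *)

      let rec run (code : instr list) (e : env) : outcome =
        match code with
        | [] -> Finished e
        | Op (_, f) :: rest -> run rest (f e)
        | Assume (pred, target, varmap) :: rest ->
            if pred e then run rest e else Deoptimized (target, varmap e)

      (* Code specialized under the assumption "x = 1"; if the assumption fails
         at run time, execution deoptimizes to the baseline function "f_base". *)
      let optimized =
        [ Assume ((fun e -> List.assoc "x" e = Int 1), "f_base", (fun e -> e));
          Op ("y := 2", fun e -> ("y", Int 2) :: e) ]   (* folded under the assumption *)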
  • 2020/12/07: Solène Mirliaz, “Provable Cost Analysis of Cryptographic Code”
  • 2020/11/27: Adam Khayam, “JSkel – A JavaScript Formalization in Skeletal Semantics” Slides
  • 2020/11/13: Santiago Bautista, “Numeric domains meet algebraic types” Slides abstract

    We report on the design and formalization of a novel abstract domain, called numeric path relations (NPRs), that combines numeric relational domains with algebraic data types. This domain expresses relations between algebraic values that can contain scalar data. The construction of the domain is parameterized by the choice of a relational domain on scalar values. The construction employs projection paths on algebraic values, and in particular projections on variant cases, whose sound treatment is subtle due to mutual exclusiveness.
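    The notion of projection path can be made concrete with a small OCaml sketch (the datatype and the path constructors are invented for the example and only loosely follow the paper): a path selects a scalar inside an algebraic value and is undefined whenever it crosses a variant case that the value does not match, which is precisely what makes the treatment of variant projections subtle.

      (* Projection paths on algebraic values; a path either reaches a scalar or
         is undefined because of a mismatched variant case.  Illustration only. *)
      type value =
        | Int of int
        | Pair of value * value
        | Left of value            (* two mutually exclusive variant cases *)
        | Right of value

      type path =
        | Here                     (* the value itself, expected to be a scalar *)
        | Fst of path | Snd of path
        | CaseLeft of path | CaseRight of path

      let rec project (v : value) (p : path) : int option =
        match v, p with
        | Int n, Here -> Some n
        | Pair (a, _), Fst p -> project a p
        | Pair (_, b), Snd p -> project b p
        | Left a,  CaseLeft p  -> project a p
        | Right b, CaseRight p -> project b p
        | _, _ -> None             (* wrong case or ill-formed path: undefined *)

      (* The same path is defined on one variant and undefined on the other, so a
         relation mentioning both projections never constrains the same value. *)
      let () =
        assert (project (Left (Int 3)) (CaseLeft Here)  = Some 3);
        assert (project (Left (Int 3)) (CaseRight Here) = None)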
  • 2020/10/23: Timothée Haudebourg, “Regular Language Type Inference with Term Rewriting” Slides abstract

    This paper defines a new type system applied to the fully automatic verification of safety properties on higher-order tree-processing functional programs. We use term rewriting systems to model the program and its semantics and tree automata to model algebraic data types. We define the regular abstract interpretation of the input term rewriting system where the abstract domain is a set of regular languages. From the regular abstract interpretation we derive a type system where each type is a regular language. We define an inference procedure for this type system which allows us to check the validity of safety properties. The inference mechanism is built on an invariant learning procedure based on the tree automata completion algorithm. This invariant learning procedure is regularly-complete and complete in refutation, meaning that if it is possible to give a regular type to a term then we will eventually find it, and if there is no possible type (regular or not) then we will eventually find a counter-example.
  • 2020/10/16: Simon Castellan, “Causality: A monad for truly concurrent computation” Slides abstract

    Causal models have long struggled to gain widespread use for modelling concurrent languages, because their semantics is difficult both to define and to implement. Recently, a methodology based on event structures, called concurrent games, made it possible to define the semantics of a wide variety of concurrent languages. However, this semantics is abstract and intractable. Moreover, defining the model of a new language requires familiarity with the intricacies of the model. In this talk, I will present my efforts towards implementing causal and interactive semantics based on event structures. This implementation relies on a monad that supports effects for concurrency, which makes it possible to write interpreters for concurrent languages in a straightforward way and to automatically extract an event structure semantics from them, without having to manipulate event structures explicitly.
  • 2020/10/09: Delphine Demange, “Intermittent Computing with Peripherals, Formally Verified” Website abstract

    Transiently-powered systems featuring non-volatile memory as well as external peripherals enable the development of new low-power sensor applications. However, as programmers, we are ill-equipped to reason about systems where power failures are the norm rather than the exception. A first challenge consists in being able to capture all the volatile state of the application – external peripherals included – to ensure progress. A second, more fundamental, challenge consists in specifying how power failures may interact with peripheral operations. In this work, we propose a formal specification of intermittent computing with peripherals, an axiomatic model of interrupt-based checkpointing as well as its proof of correctness, machine-checked in the Coq proof assistant. We also illustrate our model with several systems proposed in the literature. This is joint work with Gautier Berthou, Pierre-Évariste Dagand, Rémi Oudin, and Tanguy Risset.
  • 2020/09/29: Guillaume Ambal, “Formal Transformations of Operational Semantics” Slides abstract

    Skeletal semantics is a recent framework developed to formalize languages, from λ-calculus to JavaScript. Languages can be defined in a range of formats, including the intuitive big-step style (Natural Semantics) or the more detailed small-step style (Structural Operational Semantics). In this talk, I will present a new, fully automatic, generic transformation to derive a small-step semantics from a big-step one. We also generate a Coq proof of equivalence as a bonus.
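    The gap between the two formats, and what the transformation has to produce, can be seen on a minimal arithmetic language (an OCaml sketch unrelated to the actual skeletal-semantics tooling): the big-step evaluator maps an expression directly to its value, while the small-step one exposes every intermediate reduction.

      (* Big-step versus small-step semantics for a tiny language.  Illustration only. *)
      type expr = Num of int | Add of expr * expr

      (* Big-step (natural semantics): relate an expression to its final value. *)
      let rec eval : expr -> int = function
        | Num n -> n
        | Add (a, b) -> eval a + eval b

      (* Small-step (structural operational semantics): one reduction at a time. *)
      let rec step : expr -> expr option = function
        | Num _ -> None                                  (* values do not reduce *)
        | Add (Num n, Num m) -> Some (Num (n + m))
        | Add (Num n, b) ->
            (match step b with Some b' -> Some (Add (Num n, b')) | None -> None)
        | Add (a, b) ->
            (match step a with Some a' -> Some (Add (a', b)) | None -> None)

      (* Iterating small steps agrees with the big-step evaluation. *)
      let rec run e = match step e with Some e' -> run e' | None -> e

      let () =
        let e = Add (Num 1, Add (Num 2, Num 3)) in
        assert (run e = Num (eval e))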
  • 2020/09/29: Benoît Montagu, “Stable Relations and Abstract Interpretation of Higher-Order Programs” abstract
     
    We present a novel denotational semantics for the untyped call-by-value λ-calculus, where terms are interpreted as stable relations, i.e. as binary relations between substitutions and values, enjoying a monotonicity property. The denotation captures the input-output behaviour of higher-order programs, and is proved sound and complete with respect to the operational semantics. The definition also admits a presentation as a program logic. Following the principles of abstract interpretation, we use our denotational semantics as a collecting semantics to derive a modular relational analysis for higher-order programs. The analysis infers equalities between the arguments of a program and its result — a form of frame condition for functional programs.
  • 2020/09/18: Thomas Rubiano, “WebAssembly: Semantics and Security”
