Research


Overall objectives

The Kairos ambitions are to deal with the Design of Cyber-Physical Systems (CPS), at various stages, using Model-Based techniques and Formal Methods. Design here stands for co-modeling, co-simulation, formal verification and analysis activities, with connections both ways from models to code (synthesis and instrumentation for optimization). Formal analysis, in turn, concerns both functional and extra-functional correctness properties. Our goal is to link these design stages together, both vertically along the development cycle, and horizontally by considering the interactions between cyber/digital and physical models. These physical aspects comprise both physical environments and physical execution platform representations, which may become rather heterogeneous as in the cases of the Internet of Things (IoT) and computing at the edges of the gateways. The global resulting methodology can be tagged as Model-Based, Platform-Based CPS Design, see Figure 1.

Figure 1: Cyber-Physical generic architectural features

CPS design must take into account all 3 aspects of application requirements, execution platform guarantees and contextual physical environment to establish both functional and temporal correctness. The general objective of Kairos is thus to contribute in the definition of a corresponding design methodology, based on formal Models of Computation for joint modeling of cyber and physical aspects, and using the important central concept of Logical Time for expressing the requirements and guarantees that define CPS constraints.

Logical Multiform Time. It may be useful to provide an introduction and motivation for the notion of Logical Multiform Time (and Logical Clocks), as they play a central role in our approach to Design. We call Logical Clock any repetitive sequence of occurrences of an event (disregarding possible values carried by the event). It can be regularly linked to physical time (periodic), but not necessarily so: fancy processors may change speeds, simulation engine change time-integration steps, or much more generally one may react with event-driven triggers of complex logical nature (do this after 3-times that unless this…). It is our belief that user specifications are generally expressed using such notions, with only partial timing correlations between distinct logical clocks, so that the process of realization (or “model-based compilation”) consists for part in establishing (by analysis or abstract simulation) the possible tighter relations between those clocks (unifying them from a partial order of local total orders to a global total order). We have defined in the past a small language of primitives expressing recognized constraints structuring the relations between distinct logical clocks 1, 13. This language (named CCSL for Clock Constraint Specification Language), borrows from notions of Synchronous Reactive Languages 16, Real-Time Scheduling Theory, and Concurrent Models of Computations and Communication (MoCCs) in Concurrency Theory 14 altogether. Corresponding extensions of Timed Models originally based on single (discrete or continuous) time can also be considered. Logical Time is used in our approach to express relation constraints between heterogeneous models, of cyber or physical origin, and to support analysis and co-simulation. Addressing cyber-physical systems demands to revisit logical time to deal with the multi-physical and sometimes uncertain environments.

In the following sections, we describe in turn the research agenda of Kairos on co-modeling, co-simulation, co-analysis, verification and relation from models to code, respectively.

Last activity report : 2022

Results

New results

Decision process for Logical Time

We are progressing the study of efficient methods for (syntactic) model-based expressiveness and (semantic) analysis/verification of multiform Logical Time (LT) formalisms, centered around our original CCSL constraint language (with possibly relevant extensions/restrictions) 20, 26. This is a topic of collaboration with Chinese partners, mostly in the context of our associated-team Plot4IoT (while travel exchanges have been on long stand-by this year).

In the starting thesis of Pavlo Tokariev, we are considering how distinctive CCSL features may be handled best by proper combinations of Boolean model-checking and integer linear programming solvers.

Finally, the PhD thesis of Hui Zhao on modeling of cyber-physical systems could be defended, after delays due to distantiation 38.

Formalizing and extending Smart Contracts languages with temporal and dynamic features

This year we have pursued the work on the ANR project SIM 9.3, which has started in January 2021. This project funds the PhD of Enlin Zhu. While we started focusing mainly on smart contracts and their acceptance by lawyers, we have now decided to focus on the distributed application as a whole since what can be done on the smart contract itself is not sufficient to have a global end-to-end control of the services. The main result is a state-based modeling of the recovery mechanism used in distributed ledger applications. This is a clear aspect where our formalisms are very pertinent to capture both state dependent properties and (logical) time requirement. This is also a place where the actual implementation and the temporality of the orchestration has a direct impact on the function to be achieved. It is indeed very good to show the benefit of a model-based approach to reason, conduct verification and then hope to generate correct-by-construction code. We mainly target the generation of the control part that puts into music the purely functional parts deployed either on smart contracts or on some of the actors of the distributed service (including oracles). While this result has been submitted twice, it was not yet accepted by a reviewing committee.

Another main result of this year is to understand how modifiable smart contracts, as introduced in an object-oriented setting 4, can increase smart contract flexibility, as modifiable contracts allow for changes to be made to the terms of the contract after it has been deployed. This can be particularly useful in situations where the contract needs to adapt to changing circumstances or evolving business needs. Modifiable smart contracts can also improve efficiency, as they allow for updates to be made to existing contracts rather than requiring the creation and deployment of entirely new contracts. This is the topic of the Ph.D. of Mansur Khazeev.

CCSL extension to stochastic Logical Time

As a follow-up of research from previous years, we considered in the starting PhD thesis of Arseniy Gromovoy the “light-weight” stochastic extension of CCSL with statistical purposes. The idea is to admit that a clock can be faster than (or also included in) another one safe for a few occasions. Unlike fancy stochastic model-checkers, which aim at computer percentage likeness of property satisfaction, we attempt to reduce probability aspects as soon as possible to a known number of extra steps in reachable state space construction, to reduce the problem to more tractable, homogeneous data structures in verification so that to be able to reduce the problem.

Safety rules for autonomous driving

This work, conducted in the context of a CIFRE PhD contract with Renault Software Factory, led to the PhD defense of Joëlle Abou Faysal in June 2022 37, and to her hiring as R&D engineer in this company. We provided a formal language to describe and verify safety rules for autonomous vehicles, with a concrete syntax meant to be close to Safety Engineers expectations, and a semantics allowing formal verification based on our modeling and verification environments around logical time 28, 34. Note that from September 2022, we are revisiting this language in the newly started PhD thesis of Maksym Labzhaniia, where the notion of logical space, and its mapping to physical space is introduced.

Take-over between automated and human automotive driving

This work was conducted as part of PSPC project ADAVEC 9.4, coordinated for UCA and Inria by Amar Bouali, and with industrial partners the local companies EpicnPoc and Avisto, and a back-up from Renault Research Factory in Sophia-Antipolis.

The goal is to study modeling of disengagement principles, by which driving responsibility may be taken over in between either a human or an automatic IA driver. This requires the formalization of take-over rules, which strongly rely on observation of both the outside road environment, but also the cockpit and the human driver state. This results in a complex software system that can be connected either to simulators of various levels, or to actual sensors embedded in the “Bowl”, a physical modular driving cockpit meant for experimentation and demonstrated by EpicnPoc at the Consumer Electronic Show (CES) in Las Vegas, in January 2022. A special focus was put this year on Software-Oriented Architecture (SOA) as a mean to connect software components efficiently. Work on CPS sustainability has also be shown in the project 39.

On the generation of compiler friendly C code from CCSL specification

This represents essentially the starting PhD work of Baptiste Allorant, newly hired in the team. The goal is to produce efficient sequential C codes from a CCSL description. A first version of the tool named ECCOGEN is completed, and under intensive testing. We are currently conducting extensive performance evaluation to test thousands of generated compiler friendly codes. The tool can run on four prevalent compilers: GNU gcc, INTEL icc, CompCert, and CLANG/LLVM. In addition to the small benchmarks of CCSL at hand, we produce synthetic ones with random generation techniques as well. A detailed research report is under way. The next step will be to extend ECOGEN in order to generate tuned parallel C code.

Continuation of the work on the program recognition with neural networks

This is joint work with Enrico Formenti, professor at UCA, including the co-supervision of master 2 UCA intern Alessandro Pepegna, also as a follow-up of work started last year with Corentin Fossati. Alessandro added two main contributions:

  • Implementing with LLVM a tool that computes the data dependency graph of a C code, and prints it in a graph file format. This graph is annotated with some compiler semantic attributes to grab the semantics of the C code.
  • Implementing with LLVM a tool that generates multiple data dependency versions from the same C code, by considering multiple combinations of compiler flags and phase orderings. All these versions of the same C code will be used in the machine learning phase (training the neural network).
  • Behavioral semantics in Model Driven Engineering

    In the last years, in collaboration with other academic researchers, we developed the GEMOC studio, a language workbench to develop new modeling or programming languages and their interactions, based on a formal concurrency model.

    Two new results emerged this year. First, we developed the formal, concurrent and timed operational semantics of platform aware Modal Sequence Diagrams 22 (MSD). The originality was to encode in the behavioral semantics the refinement resulting from the deployment of an application into a specific hardware architecture. More precisely, Platform aware Modal Sequence Diagram extends Modal Sequence Diagram, which allows to specify timed requirements, with a high level description of the hardware platform together with the deployment of Sequence Diagrams. This enabled the early analysis of requirements validity. This has been extensively tested and the resulting artifacts are available here, while a companion web page is accessible here. Second, we figured out that it may be difficult for a user to debug concurrency since depending on the systems, some interleaving are not of interest and make difficult the exploration of significant ones. From these experiments, we decided to built up upon the Debugger Adapter Protocol to allow concurrency debugging. It implies among other a place where exploration strategies can be defined by the user (the user being either the language designer or the system designer). The concurrency strategies hide the irrelevant interleaving to highlight the important ones 27. These strategies are defined independently of the execution engine and implemented into the GEMOC studio. The companion webpage is accessible here. Currently, we are investigating further the notion of refinement and its impact at the system engineering level through the notion of model fidelity. This is specifically addressed in the PhD thesis of Joao Cambeiro where we are characterizing the capability of different models to mimic a same reality.

    Efficient parallelism in shared memory

    We progressed further the technology underlying the LoPhT tool 6.1.6. Recent results are published in 32, and we will continue these efforts as part of the recently accepted ANR Caotic project. Ongoing discussions are taking place with Airbus concerning the continued evaluation of LoPhT for industrial use.

    A language and compiler for embedded and Real-Time Machine Learning programming

    This research axis connects three research and engineering fields: Real-Time/Embedded programming, High-Performance Compilation, and Machine Learning. Work has been funded by the Confiance.AI PIA project and by the ES3CAP project. Hugo Pompougnac defended his PhD thesis on this topic in December 2022.

    The Static Single Assignment (SSA) form has proven an extremely useful tool in the hands of compiler builders. First introduced as an intermediate representation (IR) meant to facilitate optimizations, it became a staple of optimizing compilers. More recently, its semantic properties2 established it as a sound basis for High-Performance Computing (HPC) compilation frameworks such as MLIR, where different abstraction levels of the same application3 share the structural and semantic principles of SSA, allowing them to co-exist while being subject to common analysis and optimization passes (in addition to specialized ones).

    But while compilation frameworks such as MLIR concentrate the existing know-how in HPC compilation for virtually every execution platform, they lack a key ingredient needed in the high-performance embedded systems of the future—the ability to represent reactive control and real-time aspects of a system. They do not provide first-class representation and reasoning for systems with a cyclic execution model, synchronization with external time references (logical or physical), synchronization with other systems, tasks and I/O with multiple periods and execution modes.

    And yet, while the standard SSA form does not cover these aspects, it shares strong structural and semantic ties with one of the main programming models for reactive real-time systems: dataflow synchrony, and its large and structured corpus of theory and practice of RTE systems design.

    Relying on this syntactic and semantic proximity, we have extended the SSA-based MLIR framework to open it to synchronous reactive programming of real-time applications. We illustrated the expressiveness of our extension through the compilation of the pure dataflow core of the Lustre language. This allowed us to model and compile all data processing, computational and reactive control aspects of a signal processing application4. In the compilation of Lustre (as embedded in MLIR), following an initial normalization phase, all data type verification, buffer synthesis, and causality analysis can be handled using existing MLIR SSA algorithms. Only the initialization analysis specific to the synchronous model (a.k.a. clock calculus or analysis) requires specific handling during analysis and code generation phases, leading to significant code reuse.

    The MLIR embedding of Lustre is non-trivial. As modularity based on function calls is no longer natural due to the cyclic execution model, we introduced a node instantiation mechanism. We also generalized the usage of the special undefined/absent value in SSA semantics and in low-level intermediate representations such as LLVM IR. We clarified its semantics and strongly linked it to the notion of absence and the related static analyses (clock calculi) of synchronous languages.

    Our extension remains fully compatible with SSA analysis and code transformation algorithms. It allows giving semantics and an implementation to all correct SSA specifications. It also supports static analyses determining correctness from a synchronous semantics point of view.

    First results, obtained in collaboration with Google, have been published in ACM TACO 23 and was presented in the HiPEAC 2022 conference. A second paper is under writing. The mlirlus software has been published as open-source.

    An ANR project proposal has been submitted, and our participation in the Confiance.AI national initiative (in project EC7 – embedded AI) continues.

    Formal verification of Logical Execution Time applications

    We conducted deep investigations on how the PsyC language, developed by the Krono-Safe company, associates precisely features both from Synchronous Reactive languages (notions of Multiform Logical Time, logical and pseudophysical clocks, clock-based temporal programming primitives) and from Logical Execution Time (LET) languages (mandatory exact execution time intervals for non-elementary computation blocks). We derived a direct operational semantics as well as a translation semantics to Esterel or Lustre, which fully comply at the instants where external behaviors can be observed (input/output of temporal values). A technical report is under progress, which can be considered as a formal semantics description of the main language features. A consistent set of results has been presented at ERTS2022 conference 33.

    This work is conducted in the context of Fabien Siron CIFRE PhD thesis with the Krono-Safe SME, aiming at:

    • Verifying the correctness of a static scheduling plan with respect to a logical execution time specification provided in the PsyC language designed and developed by Krono-Safe.
    • Validating a logical time specification provided in PsyC with respect to logical time safety properties.

    Interference analysis in Real-Time task mapping

    Concurrent real-time tasks, from a single application or simply run on a common multicore processor, may interfere mutually because of shared resources. As a result, the global end-to-end worst-case latency is not always a simple combination of individual worst-case execution times taken in isolation. We considered the issue of modeling-for-analysis for such system-level features, with a concern on compatibility with existing design practices of industrial partners (Renault Software Factory and Thales_Alenia_Space in that case), in the context of IRT Saint-Exupery Archeocs collaborative project REFERENCE NOT FOUND: KAIROS-RA-2022/label/contracts:archeocs. This work led to the post-doc programme of Frédéric Fort, started in November 2022.

    Trustworthy Fleet deployment and management

    This activity is a follow-up of Nicolas Ferry previous activities in ENACT H2020 project, now renewed since his arrival in Kairos in the TrustFleet PHC and DYNABIC HEU projects 9.2.2. Continuous and automatic software deployment is still an open question for IoT systems, especially at the Edge and IoT ends. The state-of-the-art Infrastructure as Code (IaC) solutions are established on a clear specification about which part of the software goes to which types of resources. This is based on the assumption that, in the Cloud, one can always obtain the exact computing resources as required. However, this assumption is not valid on the Edge and IoT levels. In production, IoT systems typically contain hundreds or thousands of heterogeneous and distributed devices (also known as a fleet of IoT/Edge devices), each of which has a unique context, and whose connectivity and quality are not always guaranteed. In ENACT, we both investigated the challenge of automating the deployment of software on heterogeneous devices and of managing variants of the software which fit different types or contexts of Edge and IoT devices in the fleet.

    Results are presented in 24, 25. In 2022, GeneSIS (a part of the aforementioned results) was recognized by the EC innovation radar as highly innovant with high maturity level. The natural next step is to investigate how to guarantee the trustworthiness of the deployment when (i) the quality of the devices is not guaranteed, (ii) the context of each device is continuously changing in an unanticipated manner, and (iii) software components are frequently evolving in the whole software stack of each device. In such context, ensuring the proper ordering and synchronization of the deployment actions is critical to improve the quality and trustworthiness and to minimize the downtime. Kairos led a proposal for an EU project (Horizon Europe call HORIZON-CL3-2022-CS-01-02: Trustworthy methodologies, tools and data security “by design” for dynamic testing of potentially vulnerable, insecure hardware and software components), which has been submitted in November 2022 and where we aim at leading research activities along the main activities of the project on Co-modeling.

    Model-Based serverless platform for the Cloud-Edge-IoT continuum

    One of the most prominent implementation of the serverless programming model is Function-as-a-Service (FaaS). Using FaaS, application developers provide source code of serverless functions, typically describing only parts of a larger application, and define triggers for executing these functions on infrastructure components managed by the FaaS provider. The event-based nature of the FaaS model is a great fit for Internet of Things (IoT) event and data processing. However, there are still challenges that hinder the proper adoption of the FaaS model on the whole IoT-Edge-Cloud continuum. These include (i) vendor lock-in, (ii) the need to deploy and adapt serverless functions as well as their supporting services and software stack to the cyber-physical context in which they will execute, and (iii) the proper orchestration and scheduling of serverless functions deployed on the whole continuum. These observations are being confirmed by our ongoing systematic literature review initiated as part of the TrustFleet PHC aurora project in collaboration with SINTEF (Oslo, Norway), and performed as part of Barbara de Oliveira internship. In parallel, and to address the aforementioned challenges we initiated a first prototyping platform for the design, deployment, as well as maintenance of applications over the IoT-Edge-Cloud continuum 29. In particular, our platform enables the specification and deployment of serverless functions on Cloud and Edge resources as well as the deployment of their supporting services and software stack over the whole IoT-Edge-Cloud continuum. Next step will be to investigate solutions for the proper and timely orchestration and scheduling of these functions.

    Behavioral equivalence of Open Systems

    This work is conducted in part joint with Rabea Ameur-Boulifa, associate professor at Telecom Paris. We consider Open (concurrent) Systems where the environment is represented as a number of processes whose behavior is unspecified. Defining their behavioral semantics and equivalences from a Model-Based Design perspective naturally implies model transformations. To be proven correct, they require equivalence of “Open” terms, in which some individual component models may be omitted. Such models take into account various kind of data parameters, including, but not limited to, time. The middle term goal is to build a formal framework, but also an effective tool set, for the compositional analysis of such programs. A journal article on these topics was accepted 19.

    Weak bisimulation for open systems is significantly more difficult to decide in practice, because of the presence of unbounded loops in the definition of weak (unobservable) transitions. We have defined two different algorithms about weak bisimulation for open systems:

    • the first one, given two open systems and a candidate (weak) bisimulation relation between their states, searches for weak transitions (chains of strong symbolic transitions) establishing a correspondence between the 2 systems. This is a bounded search, to avoid infinite loops of unobservable moves. As a consequence, this algorithm is only semi-decidable.
    • the second one rewrites (simplifies) the symbolic automaton modeling an open system behavior, using a pattern-based approach that preserves weak bisimulation by construction.

    The corresponding development of algorithms and implementation on our VerCors platform, is in progress, in collaboration with ECNU Shanghai (despite confinement issues).

    Standardization and performance evaluation in ETSI oneM2M standard

    Within the standards for M2M and the Internet of Things, managed by ETSI, oneM2M, and in the context of the new started contract TTF 1019 (2022-2024), we are studying performance evaluation of some (but not all) open source initiatives of the oneM2M standard. More precisely, we are extending the (actually weak) Semantic Discovery mechanism of the IoT oneM2M standard. The goal is to enable an easy and efficient discovery of information and a proper inter-networking with external source/consumers of information (e.g. a database in a smart city or in a firm), or to directly search information in the oneM2M system for big data purposes. oneM2M ETSI standard has currently rather weak native discovery capabilities that work properly only if the search is related to specific known sources of information (e.g. searching for the values of a known set of containers) or if the discovery is very well scoped and designed (e.g. the lights in a house).

    Advanced Semantic Discovery for IoT objects and networks

    Within the shield of ETSI (European Telecommunication Standard Institute), we study and participate to the Advanced Semantic Discovery (ASD) and query capabilities for the oneM2M standard. The goal is to enable an easy and efficient discovery of information and a proper interworking with external source/consumers of information (e.g. a distributed data base in a smart city or in a firm), or to directly search information in the oneM2M system for big data purposes. This is a direct follow-up of our researches on looking for suitable mechanisms and protocols to perform an ASD as introduced in 9 and 42.

    oneM2M has currently native discovery capabilities that work properly only if the search is related to specific known sources of information (e.g. searching for the values of a known set of containers) or if the discovery is well scoped and designed (e.g. the lights in a house). When oneM2M is used to discover wide sets of data or unknown sets of data, the functionality is typically integrated by ad hoc applications that are expanding the oneM2M functionality. This means that this core function may be implemented with different flavors, which is not optimal for interworking and interoperability. For further information, we refer to the ended contract STF 589 and the software released in ETSI Labs ASD.

    Asynchronous Contact Tracing to catch Covid-19 using IoT objects and IoT networks

    Within the shield of ETSI, we standardised  43 a novel contact tracing protocol, called Asynchronous Contact Tracing (ACT). ACT traces the IoT connected objects that may have been infected by the Covid-19 virus (or future pandemic viruses). This shifts the paradigm, from searching for a person in the process of infecting another to the tracing of both potential contamination and infections, and leveraging on the combination of the two information.

    ACT Web and Mobile application:

    ACT Web and Mobile application:

    Figure 2: ACT Web and Mobile application

    In 2022, with the Master student Alessio di Dio, we released the first implementation of ACT, consisting in 3 modules: (i) an ETSI/oneM2M communication infrastructure, (ii) a mobile application (android), and (iii) a web application 40, see Figure 2.

    The new ACT method will require the use of existing ready-to-market IoT-based technology and well-established wireless network techniques, in particular the ones specified in the ETSI standards ecosystem. Moreover, it will preserve the user’s privacy in accordance with GDPR and/or other regional requirements not requiring the transmission of any personal information by the user 36.

    Federating heterogeneous Digital Contact Tracing platforms

    In this work, done in collaboration with the University of Novi-Sad and the Mathematical Institute of the Serbian Academy of Sciences and Arts, both in Serbia, we present a comprehensive, yet simple, extension to the existing systems used for Digital Contacts Tracing (DGT) in Covid-19 pandemic. The extension, called BAC19, will enable those systems, regardless of their underlying protocol, to enhance their sets of DGT contacts and to improve global fight against pandemic during the phase of opening boarders and enabling more traveling. BAC19 is a structured overlay network, or better, a Federation of mathematical Distributed Hash Tables (DHT), as introduced in 11, 15. Its model is inspired by the Chord and Synapse structured overlay networks. The published work presents the architecture of the overlay network federation and shows that the federation can be used as a formal model of forward contact tracing 21.

    Logical Time modeling and implementation of distributed systems

    As part of a scientific collaboration with Google (US) we are exploring the modeling and implementation of distributed heterogenous systems using logical time formalisms. The collaboration should receive formal support in the beginning of 2023.

    A middleware for the experimentation on IoT mobile networks

    In the context of studies on decentralized algorithms from mobile dynamic networks, we investigated the state of the art of the experimentation tools. We discovered that existing solutions, either coming from labs or companies, do not match the requirements of experimentation as it is usually done by Researchers. Indeed commercial products focus on reliability and interoperability at the expense of versatility, while lab tools most often serve as proof of concepts. The experimental study of algorithms requires the availability, in a single solution, of the following features: support for both synchronous and asynchronous communication, simulation/emulation of large systems, fast deployment, Web interoperability, and full decentralization, just to name a few. Idawi was designed and implemented to the very purpose of providing the Research community with a tool tailored to its needs. The resulting middleware has been released on the MAVEN global distribution platform for Java software. MAVEN Central statistics indicate that Idawi binaries have been downloaded 350 times in the year 2022. In addition to this, the Idawi and its satellite tools are all Open Source. They are released under the Apache V2 license. For the sake of Open Science, their source code is fully available on GitHub, see 30, 31, 41.

     

    ……….

    Comments are closed.