Team Ecuador studies Algorithmic Differentiation (AD) of computer programs, blending :
AD theory:We study software engineering techniques, to analyze and transform programs mechanically. Algorithmic Differentiation (AD) transforms a program Pthat computes a function P’that computes analytical derivatives of adjoint modeof AD, a sophisticated transformation that yields gradients for optimization at a remarkably low cost.
AD application to Scientific Computing:We adapt the strategies of Scientific Computing to take full advantage of AD. We validate our work on real-size applications.
We aim to produce AD code that can compete with hand-written sensitivity and adjoint programs used in the industry. We implement our algorithms into the tool Tapenade, one of the most popular AD tools at present.
Our research directions :
- Efficient adjoint AD of frequent dialects e.g. Fixed-Point loops.
- Development of the adjoint AD model towards Dynamic Memory Management.
- Evolution of the adjoint AD model to keep in pace with with modern programming languages constructs.
- Optimal shape design and optimal control for steady and unsteady simulations. Higher-order derivatives for uncertainty quantification.
- Adjoint-driven mesh adaptation.
Last activity report : 2021
- 2021 : PDF – HTML
- 2020 : PDF – HTML
- 2019 : PDF – HTML
- 2018 : PDF – HTML
- 2017 : PDF – HTML
- 2016 : PDF – HTML
- 2015 : PDF – HTML
- 2014 : PDF – HTML
Algorithmic Differentiation of OpenMP
For applications that are parallelized for multi-core CPUs or GPUs using OpenMP, it is desirable to also compute the gradients in parallel. We extended the AD model of Tapenade (source transformation, association by address, storage on tape of intermediate values) towards correct and efficient differentiation of OpenMP parallel worksharing loops, one of the most commonly used OpenMP features, in tangent-linear and adjoint mode. This work was published in ACM TOMS .
The major issue raised by the adjoint mode is the transformation of variable reads into adjoint variable overwrites, more accurately into increments. While there is no parallel conflict between two reads, two concurrent increments can cause a data race, unless they are both atomic. Classical automated detection of independence is as always limited. We propose to gather information about the memory access patterns of the original code, which is assumed correct and therefore free of data races, and to reuse this information into a theorem prover with which we check the safety of the shared memory accesses of the adjoint code.
A poster was accepted for presentation at PPoPP’22. An article is in preparation.
Algorithmic Differentiation of CUDA
In collaboration with ONERA, we study extension of Tapenade to the parallel constructs of CUDA. The industrial objective is to include Adjoint AD natively into the successor of ONERA’s “Elsa” CFD platform. Our research objective is to extend our adjoint AD model to CUDA code. While this extension bears similarity with our work on OpenMP, several specific aspects of CUDA deserve a specific treatment. For instance the stack storage mechanism central to our adjoint AD model must be redesigned to take into accout the memory limitations of GPU code sections.
This year, we delivered to ONERA a version of Tapenade that can parse and regenerate CUDA C source, and that can differentiate in tangent mode a few of the test cases provided by ONERA.
A joint article is in preparation about the general architecture of ONERA’s new CFD solver, that includes a section on integrating AD into the compilation chain and on the needed adaptions of Tapenade.
Adjoint Differentiation and Garbage Collection
Data-Flow reversal is at the heart of our model of Source-Transformation Adjoint Algorithmic Differentiation (Adjoint ST-AD). The presence of addresses, pointers, and pointer arithmetics in the target language poses challenges to Data-Flow reversal, which we can deal with in languages such as C. However, when the target language uses Garbage Collection (GC), for instance Java or Python, the notion of address that is used by Data-Flow reversal disappears. Moreover, GC is asynchronous and does not appear explicitly in the source. We studied an extension to the model of Adjoint ST-AD suitable for a language with GC. We validated this approach on a Java implementation of a simple Navier-Stokes solver. We compared performance with alternative models of Adjoint AD, such as Overloading-based AD (e.g. ADOL-C) which by design could handle GC more easily.
Application to large industrial codes
We support users with their first experiments of Algorithmic Differentiation of large codes. This concerned two codes this year.
One application was done by Shreyas Gaikwad, University of Texas at Austin, PhD student supervised by Patrick Heimbach. His goal is to produce the adjoint of glaciology codes such as SICOPOLIS and the glaciology component of the MIT GCM. Both are Fortran 90 codes that have been previously differentiated with OpenAD, the former AD tool developed by Argonne National Lab. Krishna Narayanan provided crucial help and expertise, as he had been in charge of the previous differentiation with OpenAD. Indeed, differentiation with Tapenade did not encounter bugs in the tool itself. It rather underlined interface and documentation difficulties to apply special recommended strategies e.g. for adjoint AD of linear solvers.
The other code this year is CTFEM, an element of the code suite developed by the CASTOR team of INRIA and University of Nice for the ITER project. CTFEM is a plasma simulation code written in Fortran90. Together with Hervé Guillard, we applied Tapenade to produce the adjoint code of CTFEM. In addition to several Tapenade bugs, now fixed, we encountered two more interesting issues. One is that CTFEM introduces memory aliasing at a few locations. Classically, memory aliasing is adverse to adjoint AD and should be avoided. In general, the AD tool emits a warning message when potential aliasing is detected. It unfortunately failed to do so in a few particular cases. The other issue is about array notation: Fortran90 actual parameters of calls that are arrays are in principle passed by reference, allowing the called procedure to modify the actual parameter. This is also true for array sections, i.e. actual parameters of the form
T(0:10:2). On the other hand, an array section that uses an indirection such as T(ind(0:10:2)) appears to be passed by value, although we found no literature on the subject. Tapenade now points out this adverse situation, which can be solved by a simple local code rewrite.
The progress in highly accurate schemes for compressible flows on unstructured meshes (together with advances in massive parallelization of these schemes) allows to solve problems previously out of reach. The four-years programme , associating:
- IMAG of Montpellier University (B. Koobus, coordinator),
- Computational AeroAcoustics Laboratory (CAALAB) of Keldysh Institute of Moscow (T. Kozubskaya, head), and
- Ecuador of INRIA Sophia-Antipolis
is supported by the French ANR and by the Russian Science Foundation. Norma is a cooperation on the subject of the extension of Computational AeroAcoustics methods to the simulation the noise emission by rotating machines (helicopters, future aerial taxis, unmanned aerial vehicles, wind turbines…). The tasks of INRIA in this Russian-French cooperation program are:
- the advancement of two numerical techniques, namely:
- a higher-order approximation scheme for compressible Navier-Stokes equations, and
- mesh adaptation methods for the flows under study (DES modeled flows able to model noise generation).
Among this year’s results:
- We delivered a review on High-Order methods for compressible CFD.
- We defined CENO3D, a 3D finite-volume scheme relying on cell-based reconstruction and fourth-order accurate, and we developed and tested it in a CFD software.
- We installed an anisotropic metric-based mesh adaptation for rotating machines and we validated it on a first test case, a mixing device rotating in a cylinder. The study of the next test case, the Caradonna-Tung helix (Norma test case 1.1.) is under progress.
- We started the design of a new mesh adaptation based on H-P principle (simultaneous adaptation of the mesh and of the scheme accuracy), in order to combine the higher-order scheme CENO3D with our anisotropic metric-based mesh adaptation technology.
Modeling turbulence is an essential aspect of CFD. The purpose of our work in hybrid RANS/LES (Reynolds Averaged Navier-Stokes / Large Eddy Simulation) is to develop new approaches for industrial applications of LES-based analyses. In the applications targeted (aeronautics, hydraulics), the Reynolds number can be as high as several tens of millions, far too high for pure LES models. However, certain regions in the flow can be predicted better with LES than with usual statistical RANS (Reynolds averaged Navier-Stokes) models. These are mainly vortical separated regions as assumed in one of the most popular hybrid models, the hybrid Detached Eddy Simulation (DES) model. Here, “hybrid” means that a blending is applied between LES and RANS. An important difference between a real life flow and a wind tunnel or basin is that the turbulence of the flow upstream of each body is not well known.
The development of hybrid models, in particular DES in the litterature, has raised the question of the domain of validity of these models. According to theory, these models should not be applied to flow involving laminar boundary layers (BL). But industrial flows are complex flows and often present regions of laminar BL, regions of fully developed turbulent BL and regions of non-equilibrium vortical BL. It is then mandatory for industrial use that the new hybrid models give a reasonable prediction for all these types of flow. We concentrated on evaluating the behavior of hybrid models for laminar BL and for vortical wakes. While less predictive than pure LES on laminar BL, some hybrid models still give reasonable predictions for rather low Reynolds numbers.
During the first phase of Norma, Montpellier and Moscow are computing a series of initial test cases in order to control the consistancy of the results produced by the two platforms of CFD, namely Noisette for Moscow, and Aironum for Montpellier.
The physical problem addressed by Norma involves a computational domain made of at least two components having different rotative motions. The numerical problem of their combination gave birth to many specialized schemes, such as the so-called sliding method, chimera method, immersed boundary method (IBM). In concertation with Moscow, Montpellier is introducing a novel IBM in the CFD code Aironum. The Ecuador team is studying in cooperation with Lemma engineering (Sophia Antipolis) a novel sliding/chimera method.
High order approximations
After many decades of research which we suumarized in , approximation of unstructured meshes have become the common practice in CFD. High order approximations for compressible flows on unstructured meshes are facing many constraints that increase their complexity i.e. their computational cost. This is clear for the largest class of approximation, the class of
Control of approximation errors
Reducing approximation errors as much as possible by changing the mesh is a particular kind of optimal control problem. We formulate it exactly this way when we look for the optimal metric of the mesh, which minimizes a user-specified functional (goal-oriented mesh adaptation). In that case, the usual methods of optimal control apply, using adjoint states that can be produced by Algorithmic Differentiation.
This year, we extended mesh adaptation methods to the simulation of rotating machines is a special unsteady periodic flow. We also continued our new analysis for h-p anisotropic mesh adaptation.