GdT OT-PDE-ML

Organisers:

  • Thomas Gallouët (thomas.gallouet@inria.fr)
  • Quentin Mérigot (quentin.merigot@universite-paris-saclay.fr)
  • Luca Nenna (luca.nenna@universite-paris-saclay.fr)
  • Katharina Eichinger (katharina.eichinger@inria.fr)

NB: write to us to be added to the GdT mailing list

Where to find us:

 Institut de mathématique d’Orsay (IMO), building 307, room 1A13.

Upcoming sessions:

  • 13/06/2025 (room 1A13)
    • 14h00 Giovanni Conforti
      • Title: Semiconcavity of entropic potentials and exponential convergence of Sinkhorn’s algorithm
      • Abstract: The entropic optimal transport problem is a regularised version of the classical optimal transport problem which consists in minimising the relative entropy against a reference distribution among all couplings of two given marginals. In this talk, we study the stability of optimisers and the exponential convergence of Sinkhorn’s algorithm, which is widely used to solve EOT in practice. In the first part of the talk, we will illustrate how semiconcavity of the dual optimal variables, known as entropic potentials, plays a key role in understanding both problems. In the second part of the talk, we discuss how to establish semiconcavity bounds in examples of interest such as log-concave marginals or marginals with bounded support. Joint work with A. Chiarini, G. Greco and L. Tamanini.
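
Since Sinkhorn’s algorithm is central to the abstract above, here is a minimal numpy sketch of the standard iteration for discrete marginals, with the dual entropic potentials recovered from the scaling vectors. This is generic background rather than code from the talk; the function name and the fixed iteration count are illustrative choices.

    import numpy as np

    def sinkhorn(a, b, C, eps, n_iter=500):
        """Standard Sinkhorn iterations for entropic OT between histograms
        a (n,) and b (m,) with cost matrix C (n, m) and regularisation eps."""
        K = np.exp(-C / eps)                      # Gibbs kernel
        u, v = np.ones_like(a), np.ones_like(b)
        for _ in range(n_iter):
            u = a / (K @ v)                       # enforce the first marginal
            v = b / (K.T @ u)                     # enforce the second marginal
        P = u[:, None] * K * v[None, :]           # primal coupling
        f, g = eps * np.log(u), eps * np.log(v)   # entropic potentials (up to constants)
        return P, f, g

    # Example: uniform histograms on a 1D grid with quadratic cost
    x = np.linspace(0.0, 1.0, 50)
    C = (x[:, None] - x[None, :]) ** 2
    a = b = np.full(50, 1.0 / 50)
    P, f, g = sinkhorn(a, b, C, eps=1e-2)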

Past sessions:

  • 19/05/2025 (room 1A13)
    • 14h00 Averil Aussedat
      • Title: Towards a characterization of the geometric tangent cone to the Wasserstein space
      • Abstract: The Wasserstein space can be given a family of geometric tangent cones built from geodesics. These tangent cones form a subset of measures on the tangent bundle, and most often a strict subset. This talk addresses the following problem: can we tell whether an element belongs to the tangent cone by looking at the directional derivative of the Wasserstein distance along it? We give some context around this question, present the currently available results, and discuss some open directions.
    • 14h55 Julie Delon
      • Title: Beyond Wasserstein: a general fixed-point approach for computing barycentres of measures (Eloi Tanguy, Julie Delon, and Nathaël Gozlan)
      • Abstract: In this talk, we investigate the computation of barycentres for generic optimal transport costs, extending beyond the well-studied Wasserstein barycentres. We introduce a fixed-point algorithm that generalizes prior works and prove its convergence under minimal assumptions. Our approach applies to a broad class of cost functions and probability measures, providing a flexible framework for the numerical computation of generic transport barycentres. (A sketch of a classical special case of such fixed-point iterations is given after this session’s talks.)
    • 15h50 Clément Soubrier 
      • Title: Applying optimal transport based metrics to analyze biological shape heterogeneity
      • Abstract: Analyzing shapes is a key step towards understanding the biological functions and properties of cells and proteins. Subtle shape variations and conformations can have a significant impact on the interactions with the shape’s environment. In order to capture these variations, we developed optimal transport based alignment tools and metrics, enabling us to quantify shape heterogeneity and classify morphotypes. In particular, I will present the numerical challenges involved in both alignment and metric computation, as well as examples of biological data classification.
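
As a point of reference for the fixed-point approach in Julie Delon’s talk above, here is a minimal numpy/scipy sketch of the classical fixed-point iteration for the quadratic-cost (2-Wasserstein) barycentre of centred Gaussians, a well-known special case; this is background material, not the algorithm of the talk, and the initialisation and iteration count are illustrative.

    import numpy as np
    from scipy.linalg import sqrtm

    def gaussian_w2_barycentre(covs, weights, n_iter=50):
        """Fixed-point iteration for the 2-Wasserstein barycentre of centred
        Gaussians N(0, covs[i]), with SPD covariance matrices covs and
        weights summing to one (Alvarez-Esteban et al., 2016)."""
        S = sum(w * C for w, C in zip(weights, covs))       # initial guess
        for _ in range(n_iter):
            R = np.real(sqrtm(S))                           # S^{1/2}
            R_inv = np.linalg.inv(R)
            M = sum(w * np.real(sqrtm(R @ C @ R)) for w, C in zip(weights, covs))
            S = R_inv @ M @ M @ R_inv                       # updated covariance
        return S
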
  • 31/03/2025 (room 1A13)
    • 14h00 Johannes Hertrich
      • Title: Wasserstein Gradient Flows for Maximum Mean Discrepancies with Riesz Kernels
      • Abstract: Motivated by kernel-based loss functions in machine learning, we study Wasserstein gradient flows with respect to the maximum mean discrepancy (MMD) with Riesz kernels, which is also known as the energy distance. Since the kernel is non-smooth and not $\lambda$-convex, most of the standard theory on Wasserstein gradient flows does not apply in this case. As a remedy, we characterize Wasserstein gradient flows via steepest descent directions, which allows us to compute analytic solutions for some special cases. We pay special attention to the one-dimensional case, where the Wasserstein space can be isometrically embedded into a Hilbert space, which allows an explicit description of the Wasserstein gradient flows via one-dimensional ODEs. Numerically, the computation of MMD flows can be costly due to the double integral. To address this, we propose the fast evaluation of kernel matrix-vector multiplications via slicing (see the sliced-distance sketch after this session’s talks). Using this fast computation of the MMD, we can apply MMD flows to large-scale problems and present an example of generative modelling on standard image datasets.
    • 15h15 Christophe Vauthier
      • Title: Critical points of the Sliced-Wasserstein distance
      • Abstract: The sliced-Wasserstein (SW) metric has gained significant interest in the optimal transport and machine learning literature, owing to its ability to capture intricate geometric properties of probability distributions while remaining computationally tractable, which makes it a valuable tool for various applications, including generative modeling and domain adaptation. Our study aims to provide a rigorous analysis of the critical points arising from the optimization of the SW objective. By computing explicit perturbations, we establish that stable critical points of SW cannot concentrate on segments. This stability analysis is crucial for understanding the behaviour of optimization algorithms for models trained using the SW objective. Furthermore, we investigate the properties of the SW objective, shedding light on the existence and convergence behaviour of critical points. We illustrate our theoretical results through numerical experiments.
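
Both abstracts in this session lean on one-dimensional projections: the first to accelerate MMD computations via slicing, the second through the sliced-Wasserstein objective itself. For background, here is a minimal Monte Carlo sketch of the squared sliced 2-Wasserstein distance between two equal-size point clouds; the number of projections and the seed are illustrative choices.

    import numpy as np

    def sliced_w2_squared(X, Y, n_proj=200, seed=0):
        """Monte Carlo estimate of the squared sliced 2-Wasserstein distance
        between two point clouds X, Y of shape (n, d) with uniform weights."""
        rng = np.random.default_rng(seed)
        theta = rng.normal(size=(n_proj, X.shape[1]))
        theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # random directions
        xp = np.sort(X @ theta.T, axis=0)   # projected and sorted samples
        yp = np.sort(Y @ theta.T, axis=0)
        # in 1D, W2^2 between sorted equal-size samples is a mean of squared gaps
        return np.mean((xp - yp) ** 2)
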
  • 10/03/2025 (room 1A13)
    • 14h00 Borjan Geshkovski
      • Title: Measure-to-measure interpolation using Transformers
      • Abstract: Transformers are deep neural network architectures that underpin the recent successes of large language models. Unlike more classical architectures, which can be viewed as point-to-point maps, a Transformer acts as a measure-to-measure map, implemented as a specific interacting particle system on the unit sphere: the input is the empirical measure of the tokens in a prompt, and its evolution is governed by the continuity equation. Transformers are not limited to empirical measures, however, and can in principle process any input measure. We provide an explicit choice of parameters that allows a single Transformer to match N arbitrary input measures to N arbitrary target measures, under the minimal assumption that every pair of input-target measures can be matched by some transport map. (A discretised sketch of such particle dynamics follows this session’s talks.)
    • 15h15 Erwan Stämpfli
      • Title: Ballistic Benamou-Brenier formulation of porous media and viscous Burgers equations
      • Abstract: We study a variational formulation of the porous media equation and the viscous Burgers equation established by Yann Brenier. We recover the solutions from the minimizer of a space-time functional which bears a striking resemblance to the Benamou-Brenier formulation of optimal transportation. We construct a discretization and establish convergence of the associated numerical scheme.
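
The first talk in this session views a Transformer as an interacting particle system on the unit sphere. As a rough illustration, here is an explicit Euler discretisation of a simplified attention-driven dynamics of the kind studied in this line of work; it is a sketch under simplifying assumptions (identity value/query/key matrices, illustrative step size and temperature), not the talk’s exact construction.

    import numpy as np

    def attention_dynamics(X, beta=1.0, dt=0.1, n_steps=100):
        """Euler discretisation of a simplified self-attention particle
        system: each token (row of X) moves along the tangential part of
        an attention-weighted average of all tokens, staying on the sphere."""
        X = X / np.linalg.norm(X, axis=1, keepdims=True)   # tokens on the sphere
        for _ in range(n_steps):
            A = np.exp(beta * (X @ X.T))                   # attention scores
            A /= A.sum(axis=1, keepdims=True)              # row-wise softmax
            V = A @ X                                      # weighted averages
            V -= np.sum(V * X, axis=1, keepdims=True) * X  # project onto tangent spaces
            X = X + dt * V                                 # Euler step
            X /= np.linalg.norm(X, axis=1, keepdims=True)  # retract to the sphere
        return X
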
  • 27/01/2025 (room 1A13)
    • 14h00 Maxime Sylvestre
      • Title: Regularity estimates for optimal transport via entropic regularisation
      • Abstract: Caffarelli’s contraction theorem ensures that the optimal transport map between a Gaussian and a measure with strongly concave log-density is Lipschitz. In 2022, Chewi and Pooladian proposed a proof of this theorem using the entropic version of optimal transport. We propose here an extension of these two results based on the Prékopa-Leindler inequality. Using the Prékopa-Leindler inequality makes it possible to relax the regularity assumptions on the log-densities and to introduce anisotropy. We deduce regularity and growth results for the optimal transport map when the target is log-concave.
    • 15h15 Raphaël Barboni
      • Abstract: We study the convergence of gradient flow for the training of deep neural networks. While Residual Neural Networks are a popular example of very deep architectures, their training constitutes a challenging optimization problem, notably because of the non-convexity and the non-coercivity of the objective. Yet, in applications, these tasks are successfully solved by simple optimization algorithms such as gradient descent. To better understand this phenomenon, we focus here on a “mean-field” model of infinitely deep and arbitrarily wide ResNets, parameterized by probability measures over the product set of layers and parameters and with constant marginal on the set of layers. Indeed, in the case of shallow neural networks, mean-field models have proven to benefit from simplified loss landscapes and good theoretical guarantees when trained with gradient flow for the Wasserstein metric on the set of probability measures. Motivated by this approach, we propose to train our model with gradient flow w.r.t. the conditional optimal transport distance: a restriction of the classical Wasserstein distance which enforces our marginal condition. Performing a local Polyak-Łojasiewicz analysis, we show convergence of the gradient flow for well-chosen initializations: if the number of features is finite but sufficiently large and the risk is sufficiently small at initialization, the gradient flow converges towards a global minimizer.
  • 16/12/2024 (room 2P8)
    • 14h00 Arthur Stephanovitch
      • Title: Smooth transport maps via diffusion and applications to generative models
      • Abstract: We show that the Langevin map transporting the d-dimensional Gaussian to a k-smooth deformation is (k+1)-smooth. We give applications of this result to functional inequalities as well as generative models.
    • 15h15 Louis-Pierre Chaintron
  • 25/11/2024
    • 14h00 Pablo López Rivera
      • Title: A rate for the uniform convergence of entropic potentials
      • Abstract: In the quadratic Euclidean setting, the Brenier-McCann theorem tells us that, under mild assumptions, the optimal transport problem between two measures has a unique solution with a well-defined structure: the optimal map is the gradient of a convex function, called the Brenier potential. Estimating these potentials is difficult, however, since it amounts to solving the associated Monge-Ampère equation, a second-order nonlinear PDE. If one instead regularises the optimal transport problem by adding an entropy term modulated by a temperature parameter, this entropic regularisation provides an approximation of the optimal map at low temperature. In this talk, I will exhibit a rate of convergence of the entropic potentials and their gradients towards their unregularised counterparts, with respect to uniform convergence on compact sets, under convexity assumptions.
  • 24/10/2024
    • 14h00 Anastasia Hraivoronska
      • Title: The fully discrete JKO scheme for nonlinear diffusion and crowd motion models
      • Abstract: This talk presents a formulation of the JKO scheme restricted to atomic measures on a regular grid, as a discrete-in-space approximation to the standard JKO scheme (whose generic step is recalled at the bottom of this page). We discuss the application of this fully discrete formulation for developing new numerical schemes for nonlinear diffusion equations with drift and for the crowd motion model. The main result of this presentation is the convergence of the scheme to the corresponding PDEs as the time and space discretizations vanish.
    • 15h00 Nicola Meunier
      • TBD
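
For reference (as mentioned in the 24/10/2024 abstract above), the generic JKO step with time step $\tau$ and driving energy $\mathcal{E}$ reads $\mu_{k+1} \in \operatorname{argmin}_{\mu} \, \frac{1}{2\tau} W_2^2(\mu, \mu_k) + \mathcal{E}(\mu)$; the fully discrete variant restricts the minimisation to atomic measures supported on the grid.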