Seminars

For any questions, please contact Nina Otter.

The seminar takes place at the Laboratoire de Mathématiques d’Orsay, usually in room 2L8 (unless otherwise specified in the weekly announcement), and is also broadcast through our BigBlueButton room and recorded.

The recordings of all talks between autumn 2021 and May 2025 can be found on our previous BigBlueButton room. The recordings of talks from earlier years can be found on our YouTube Channel.

Weekly announcements of talks are sent through our mailing list. Please write to Nina Otter to be added to the list. The talks are also advertised on the events page of the Laboratoire de Mathématiques d’Orsay.

2024-2025

25/09/2024	António Leitão (Scuola Normale Superiore di Pisa)
09/10/2024	Musashi Koyama (Australian National University)
09/10/2024	Marina Meila (University of Washington)
16/10/2024	Marzieh Eidi (Max Planck Institute for Mathematics in the Sciences)
23/10/2024	Adi Onus (Queen Mary University of London)
04/12/2024	Francesco Conti (Inria Sophia Antipolis)
18/12/2024	Claire Brécheteau (Ecole Centrale Nantes)
05/02/2025	Rémi Vaucher (Université Lyon 2 and Halias Technologies)
05/02/2025	Ambrose Yim (Cardiff University)
12/02/2025	Jisu Kim (Seoul National University)
19/02/2025	Olympio Hacquard (Kyoto University)
05/03/2025	Alexander Soen (Australian National University)
10/03/2025	Ondřej Draganov (Institute of Science and Technology Austria)
26/03/2025	Charly Boricaud (Université Paris-Saclay)
07/05/2025	Mikael Vejdemo-Johansson (CUNY)
25/06/2025	Jérôme Taupin (Inria-Saclay)
02/07/2025	Ziyad Oulhaj (Université Nantes)

25/09/2024 (11h00 CET)
Speaker: António Leitão (Scuola Normale Superiore di Pisa)
Title: Topological Expressive Power of Neural Networks

How many different problems can a neural network solve? Well, what makes two problems different? In this talk, we’ll show how Topological Data Analysis (TDA) can be used to partition classification problems into equivalence classes, and how the complexity of decision boundaries can be quantified using persistent homology. Then we will look at a network’s learning process from a manifold disentanglement perspective. We’ll demonstrate why analyzing decision boundaries from a topological standpoint provides clearer insights than previous approaches. We use the topology of the decision boundaries realized by a neural network as a measure of a neural network’s expressive power. We show how such a measure of expressive power depends on the properties of neural networks’ architectures, like depth, width and other related quantities.

The talk is based on joint work with Giovanni Petri, available at: https://openreview.net/pdf?id=I44kJPuvqPD

09/10/2024 (11h00 CET)
Speaker: Musashi Koyama (Australian National University)
Title: Computing degree-1 Vietoris-Rips persistent homology more efficiently

Vietoris-Rips persistent homology is a widely used type of persistent homology to analyse the shape of point clouds. In particular, degree-1 Vietoris-Rips persistent homology is useful for detecting loop structures in space, but comes with the drawback of being computationally too expensive to apply to the large data sets encountered in the modern world.

Ripser is currently one of the most widely utilised options for computing degree-1 Vietoris-Rips persistent homology, but typically struggles with analysing large point clouds due to memory limitations.

We present a modified version of the standard reduction algorithm for point clouds in Euclidean space and show the results for code optimised to compute degree-1 persistent homology for point clouds in 2- and 3-dimensional Euclidean space.

09/10/2024 (13h30 CET)
Speaker: Marina Meila (University of Washington)
Title: Manifold Learning between Mathematics and the Sciences

This talk will extend Manifold Learning in two directions.

First, we ask whether it is possible, in the case of scientific data where quantitative prior knowledge is abundant, to explain a data manifold by new coordinates chosen from a set of scientifically meaningful functions.

Second, we ask how popular Manifold Learning tools and their applications can be recreated in the space of vector fields and flows on a manifold. For this, we need to transport advanced differential geometric and topological concepts into a data-driven framework.

Central to this approach is the order 1-Laplacian of a manifold, Δ1, whose eigen-decomposition into gradient, harmonic, and curl, known as the Helmholtz-Hodge Decomposition, provides a basis for all vector fields on a manifold. We present an estimator for Δ1, and based on it we develop a variety of applications. Among them are visualization of the principal harmonic, gradient or curl flows on a manifold, smoothing and semi-supervised learning of vector fields, and 1-Laplacian regularization. In topological data analysis, we describe the 1st-order analogue of spectral clustering, which amounts to prime manifold decomposition. Furthermore, from this decomposition a new algorithm for finding shortest independent loops follows. The algorithms are illustrated on a variety of real data sets.

Joint work with Yu-Chia Chen, Samson Koelle, Vlad Murad, Weicheng Wu, Hanyu Zhang and Ioannis Kevrekidis.

16/10/2024 (11h00 CET)
Speaker: Marzieh Eidi (Max Planck Institute for Mathematics in the Sciences)
Title: Topology as Fluid Geometry: from Theory to Applications

In this talk, I would like to present a view of how random walks can serve as the bridge between quantitative features of data, which are determined by geometric tools such as curvature, and qualitative features that can be detected by topological methods. After presenting some of the main ideas of geometric and topological data analysis via random walks on graphs, I will talk about the main challenges of extending these ideas to higher dimensions and I will discuss the known results as well as open problems in both theory and applications.

23/10/2024 (11h00 CET)
Speaker: Adi Onus (Queen Mary University of London)
Title: Local systems for periodic data

Periodic point clouds naturally arise when modelling large homogeneous structures like crystals. They come naturally equipped with a map to a d-dimensional torus given by the quotient of translational symmetries; however, there are many surprisingly subtle problems one encounters when studying their (persistent) homology. It turns out that bisheaves are a useful tool to study periodic data sets, as they unify several different approaches to study such spaces. The theory of bisheaves and persistent local systems was recently introduced by MacPherson and Patel as a method to study data with an attributed map to a manifold through the fibres of this map. The theory allows one to study the data locally, while also naturally being able to appeal to local systems of (co)sheaves to study the global behaviour of this data. It is particularly useful, as it permits a persistence theory which generalises the notion of persistent homology. In this talk I will present recent work on the theory and implementation of bisheaves and local systems to study the (persistent) homology of periodic cellular complexes.

04/12/2024 (11h00 CET)
Speaker: Francesco Conti (Inria Sophia Antipolis)
Title: Extending the equivariances of topological data analysis

The recently found synergy of Topological Data Analysis (TDA) and artificial intelligence is proving effective in many real-world scenarios. In this talk, we are going to present a new approach for the study of the shape of data by means of operators that originate from TDA and that we call Group Equivariant Non-Expansive Operators (GENEOs). The key idea of GENEOs is to allow TDA to be equivariant with respect to a subgroup of all homeomorphisms, potentially expanding the flexibility of the TDA framework. After introducing the key concepts of this mathematical setting, we are going to show that the computation of the persistence diagram is actually a GENEO. We conclude with a couple of real-world applications.

18/12/2024 (11h CET)

Speaker: Claire Brécheteau (Ecole Centrale Nantes)

Title: Statistical tests for uniformity and iidness on homogeneous spaces

I will introduce two families of statistical tests for the uniformity of samples of data points on homogeneous compact Polish spaces. The tests are based on the computation of nearest-neighbour distances, as in [1]; they are consistent and come with parametric separation rates. I will show numerical results on the flat torus, the circle, the sphere, and a subset of the Poincaré disk. In particular, the tests will be compared to classical tests on the sphere and the circle [2].

[1] Brécheteau, A statistical test of isomorphism between metric-measure spaces using the distance-to-a-measure signature, 2019

[2] García-Portugués and Verdebout, An overview of uniformity tests on the hypersphere, 2018

05/02/2025 (11h CET)

Speaker: Rémi Vaucher (Université Lyon 2 and Halias Technologies)

Title: Using Signature theory for the creation of Simplicial Complexes on time evolving Signals

The theory of signatures, developed by K.T. Chen in the 1950s, studies the geometry of paths through iterations of the Stieltjes integral. This tool, originally rooted in pure differential geometry, was later introduced into probability theory and subsequently into machine learning by Terry Lyons.

Initially used in rough path theory, it gained some recognition for its application in extracting geometric features for machine learning, reaching outstanding results.

In this presentation, we will first examine the signature of a rough path and then explore how its remarkable properties can be leveraged to construct a simplicial complex that reflects explainability within a family of time series signals. We will see how this method, initially applied to univariate signals, can be extended to more complex data, such as multivariate signals with non-homogeneous dimensions.

05/02/2025 (14h CET)

Speaker: Ambrose Yim (Cardiff University)

Title: Coverings, Groupoids, and TDA

We illustrate an application of groupoids to topological data analysis. We consider a simplicial complex K with a map f: K -> M to a compact Riemannian manifold, equipped with its universal covering (e.g. a flat torus obtained by a quotient of the Euclidean plane). We discuss how a groupoid formalism allows us to computationally infer the induced homomorphism between the fundamental groups of K and M in terms of the group action of the universal covering. Since homology is the more computationally accessible invariant, we also show that we can obtain the induced homomorphism on H1 using a natural isomorphism between first singular homology and the first homology of the fundamental groupoid.

We consider an application of this framework to point clouds in a compact Riemannian manifold, where we wish to deduce which cycles in a complex built on the point cloud correspond to “ambient” cycles in the manifold. This relies on associating a choice of minimising geodesics on M for each edge in the complex. We show that there is an open dense subset of the configuration space of the point cloud where this association can be uniquely made. We use this setup to empirically analyse and interpret the first principal persistence measures on different compact surfaces.

12/02/2025 (11h CET)

Speaker: Jisu Kim (Seoul National University)

Title: Featurization and evaluation using Topological Data Analysis

Topological Data Analysis (TDA) generally refers to utilizing topological features from data. One main focus in TDA is persistent homology, which observes data at various resolutions and summarizes topological features that persistently appear. TDA has proven valuable in enhancing machine learning applications. This presentation focuses on the application of TDA in machine learning, specifically in two aspects: featurization and evaluation.

The intricate structure of persistent homology poses challenges when directly applied to statistical or machine learning frameworks. To overcome this, persistent homology is often featurized in Euclidean space or functional space. Three papers will be discussed as examples. First, I will present “PLLay: Efficient Topological Layer based on Persistence Landscapes”, where I will explain how persistence landscapes are used to create a topological layer in a deep learning framework. Then, I will present “ECLayr: Fast and Robust Topological Layer based on Differentiable Euler Characteristic Curve”, which uses the Euler Characteristic Curve to speed up the computation compared to PLLay. I will also present “Generalized Penalty for Circular Coordinate Representation”, discussing how circular coordinates are utilized for visualization and dimension reduction.

Recently, efforts have emerged in using TDA to evaluate data or models and integrate them into machine learning models. I will present “TopP&R: Robust Support Estimation Approach for Evaluating Fidelity and Diversity in Generative Models”, where the confidence of TDA is employed for robust and reliable evaluation metrics for generative models.

19/02/2025 (11h CET)

Speaker: Olympio Hacquard (Kyoto University)

Title: Hypergraph clustering using Ricci curvature: an edge transport perspective

Hypergraphs are a generalization of graphs where hyperedges are allowed to connect an arbitrary number of nodes. This provides a much more faithful representation in many real-life cases. We tackle the problem of node partitioning by extending the notion of Ricci curvature on graphs to hypergraphs. We introduce a novel method where we transport measures defined on the hyperedges such that nodes assigned to different communities have a high transportation distance. We extensively compare this method with a similar notion of Ricci flow defined on the clique expansion, demonstrating its enhanced sensitivity to the hypergraph structure, especially in the presence of large hyperedges. The two methods are complementary and together form a powerful and highly interpretable framework for community detection in hypergraphs.

05/03/2025 (11h CET)

Speaker: Alexander Soen (Australian National University)

Title: Divergences and exponential families: applications to non-standard learning paradigms

Divergences and exponential families play a crucial role in machine learning. Beyond their ability to characterize standard supervised learning, they also serve as a foundation for designing algorithms that promote fairness and model abstention. In this talk, I will illustrate how supervised learning can be framed within the context of exponential families, enabling an analysis of the Fisher Information Matrix (FIM) of probability distributions induced by neural networks. Practically, this leads to the question of how best to estimate the FIM, for which we provide a decomposition of the variance of the two most popular estimators. Expanding beyond traditional supervised learning, I will explore a post-hoc fairness correction method based on divergence tilting. Finally, I will present a novel perspective on the learning-to-reject paradigm, where rejection is framed as thresholding exponential family density ratios within a divergence-regularized objective.

10/03/2025 (11h CET)

Speaker: Ondřej Draganov (Institute of Science and Technology Austria)

Title: Chromatic TDA and six-packs of persistent diagrams

Persistent homology, among other uses, is a popular tool to quantify spatial distribution of sets of points in Euclidean space, e.g., positions of cells on a tissue slide or atoms in a rigid material. In these examples, it is natural to consider additional information like a cell type or an element of the atoms. This motivates generalization of the standard methods to chromatic point sets — sets of points together with a color for each of them. The goal is to quantify spatial interaction of the different colors. Our approach takes advantage of variants of persistent homology induced by a map — image, kernel and cokernel persistent homology. Each yields a persistent diagram, and together with diagrams for domain, codomain and the relative homology, we get a six-pack of persistent diagrams.

In my talk, I will first introduce the six-pack of persistent diagrams, give an intuitive description of the features it captures, and describe how to use a fitting discretization, chromatic alpha complexes, to efficiently compute them. Then I will discuss how the arrangement of the diagrams can also be useful for theoretical questions, and showcase this with a problem about ratios of lengths of Euclidean minimum spanning trees.

26/03/2025 (11h CET)

Speaker: Charly Boricaud (Université Paris-Saclay)

Title: A varifold-type estimation for data sampled on a rectifiable set

Assuming that we have access to i.i.d. samples in R^n obtained from an underlying d-dimensional shape S endowed with a possibly non-uniform density θ, we propose and analyse an estimator of the varifold structure associated to S. The shape S is assumed to be piecewise C^{1,a} in a sense that allows for a singular set whose small enlargements are of small d-dimensional measure. The estimators are kernel-based, both for inferring the density and the tangent spaces, and the convergence result holds for the bounded Lipschitz distance between varifolds, in expectation and in a noiseless model. The mean convergence rate involves the dimension d of S, its regularity through a∈(0,1] and the regularity of the density θ.

07/05/2025 (11h CET)

Speaker: Mikael Vejdemo-Johansson (City University of New York)

Title: Mapper failure modes and multiple hypothesis testing

Abstract: The Mapper algorithm is one of several core techniques in Topological Data Analysis, and a major part of its power in applications derives from its relatively high interpretability. However, for a formally reliable notion of interpretability, we would rely on some version of the Nerve lemma. Failing to check that the refined cover generating the Mapper complex satisfies the hypotheses of the lemma can generate arbitrarily large changes to the topology of the complex, which is a cause to doubt specific interpretations of the complex.

In order to provide a statistical certification of the quality of a Mapper complex analysis, we would need to test every subset of the data associated with one of the Mapper simplices for a lack of topological structure. This places us solidly in the domain of multiple hypothesis testing, and a naïve approach is likely to lead to erroneous conclusions. We propose a method based on a generative model for feature-less data and simulation testing across all Mapper simplices as an aggregate. This produces a method with controlled family-wise rejection errors, as we can demonstrate in validation simulations.

25/06/2025 (11h CET)

Speaker: Jérôme Taupin (Inria-Saclay)

Title: Estimation of measure-based conformal metrics

Abstract: When considering high-dimensional objects, the Euclidean distance is not always the most suitable for assessing the actual distance between data points. It is of interest to study conformal deformations of the Euclidean metric, using functions that take the distribution of the data into account to provide a metric that better represents the geometry and statistical properties of this data. The Fermat distance is an example of such a metric, which deforms space by bringing points closer together in areas of high density. However, its theoretical analysis is complex and it is by definition restricted to measures with density.

In this talk I will discuss these limitations and introduce a variant of the Fermat distance defined for any measure that possesses strong stability and estimation properties. I will also discuss conformal metrics in general and their regularity in order to provide estimation results.

02/07/2025 (11h CET)

Speaker: Ziyad Oulhaj (Université Nantes)

Title: Gromov-Wasserstein bound between Reeb and Mapper graphs

Abstract: Since its introduction as a computable approximation of the Reeb graph, the Mapper graph has become one of the most popular tools from topological data analysis for performing data visualization and inference. However, finding an appropriate metric for comparing Reeb and Mapper graphs, in order to, e.g., quantify the rate of convergence of the Mapper graph to the Reeb graph, is a difficult problem. We handle this issue by treating Reeb and Mapper graphs as metric measure spaces. This allows us to use Gromov-Wasserstein metrics to compare these graphs directly in order to better incorporate the probability measures that data points are sampled from.

Previously

2023-2024

08/11/2023	Luis Scoccola
09/11/2023	Vincent Divol
15/11/2023	Bartek Blaszczyszyn
22/11/2023	François Petit
30/11/2023	Katharine Turner
13/12/2023	Wolfgang Polonik
21/12/2023	Céline Duval
10/01/2024	Charles Arnal
24/01/2024	Vincent Divol
31/01/2024	Henrique Ennes and Raphaël Tinarrage
08/02/2024	Eric Goubault
13/03/2024	Alexandra Rören and Adrien Beaud
20/03/2024	Sara Kalisnik and Miguel O’Malley

08/11/2023 (11h00 CET)
Speaker: Luis Scoccola
Title: What do we want from invariants of multiparameter persistence modules?

Various constructions relevant to practical problems such as clustering and graph classification give rise to multiparameter persistence modules (MPPM), that is, linear representations of non-totally ordered sets. Much of the mathematical interest in multiparameter persistence comes from the fact that there exists no tractable classification of MPPM up to isomorphism, meaning that there is a lot of room for devising invariants of MPPM that strike a good balance between discriminating power and complexity of their computation. However, there is no consensus on what type of information we want these invariants to provide us with, and, in particular, there seems to be no good notion of “global” or “high persistence” features of MPPM.
With the goal of substantiating these claims, as well as making them more precise, I will start with an overview of the theory of multiparameter persistence, including joint works with Bauer and Oudot. I will then briefly outline recent work of Bjerkevik, which contains relevant open questions and which will help us make sense of the notion of global feature in multiparameter persistence.

09/11/2023 (15h45 CET)
Speaker: Vincent Divol
Title: Spectral estimation of the Laplace-Beltrami operator

Graph Laplacians are used for various tasks in machine learning, e.g. for spectral clustering, or to find adapted bases of eigenfunctions that can be used as feature maps. In the case where the underlying graph is a neighborhood graph built on top of n i.i.d. points sampled on a submanifold M, the discrete Laplacian operator is known to converge towards a weighted Laplace operator on M. We focus on exhibiting rates of convergence for the eigenvalues of the operators. Namely, if the density p is of regularity s (large enough) and the manifold is of dimension d, we show that the estimation of the eigenvalues of the p-weighted Laplace operator can be done as quickly as the estimation of the density itself, with rates of order n^(-s/(2s+d)). We are also able to show that this rate is minimax optimal in the case of a curve (d=1). Joint work with Clément Berenfeld and Yann Chaubet.

22/11/2023 (11h00 CET)
Speaker: François Petit (CRESS, Inserm)
Title: Projected barcodes and distances for multi-parameter persistence modules

In this talk, I will present the notion of projected barcodes and projected distances for multi-parameter persistence modules. Projected barcodes are defined as derived pushforwards of persistence modules onto R and provide descriptors for multi-parameter persistence modules. Projected distances come in two flavors: the integral sheaf metrics (ISM) and the sliced convolution distances (SCD).

In the case where the persistence module considered is the sublevel-sets persistence module of a function f : X -> R^n, we will explain how, under mild conditions, the projected barcode of this module by a linear map u : R^n -> R is the collection of sublevel-sets barcodes of the composition u∘f. In particular, it can be computed using software dedicated to one-parameter persistence modules. This is joint work with Nicolas Berkouk.

30/11/2023 (11h00 CET)
Speaker: Katharine Turner (Australian National University)
Title: Representing Vineyard Modules

Time-series of persistence diagrams, more commonly known as vineyards, are a useful way to capture how multi-scale topological features vary over time. However, as the persistent homology is calculated and considered at each time step independently, we do lose significant information in how the individual persistent homology classes evolve over time. A natural algebraic version of vineyards is a time series of persistence modules equipped with interleaving maps between the persistence modules at different time stamps. Let’s call this a vineyard module. I will set up the framework for representing a vineyard module via an indexed set of vines alongside a collection of matrices. Furthermore, I will outline an algorithmic way to transform the bases of the persistence modules at each time step within the vineyard module to make the matrices within this representation as simple as possible. With some reasonable assumptions (analogous to those in Cerf theory) on the vineyard modules, this simplified representation can be completely described (up to isomorphism) by the underlying vineyard and a vector of finite length. While this vector representation is not in general guaranteed to be unique, we can prove that it will always be zero when the vineyard module is isomorphic to the direct sum of vine modules.

13/12/2023 (11h00 CET)
Speaker: Wolfgang Polonik (UC Davis)
Title: Inference in Topological Data Analysis

This talk presents some novel contributions to statistical inference for Topological Data Analysis (TDA). The presented inference methods consist of bootstrap-based confidence regions for (persistent) Betti numbers and Euler characteristic curves. In contrast to most of the other existing inference methods for TDA, our methods are based on one data set of size n, and large sample guarantees are thus established for n tending to infinity. The presented results provide insights into how the sampling distribution affects the persistence diagram. This is joint work with Benjamin Roycraft and Johannes Krebs.

21/12/2023 (15h45 CET) (joint with Probability and Statistics seminar)
Speaker: Céline Duval (Université de Lille)
Title: Geometry of excursion sets: computing the surface area from discretized points

The excursion sets of a smooth random field carry relevant information in their various geometric measures. After an introduction of these geometrical quantities showing how they are related to the parameters of the field, we focus on the problem of discretization. From a computational viewpoint, one never has access to the continuous observation of the excursion set, but rather to observations at discrete points in space. It has been reported that for specific regular lattices of points in dimensions 2 and 3, the usual estimate of the surface area of the excursions remains biased even when the lattice becomes dense in the domain of observation. We show that this limiting bias is invariant to the locations of the observation points and that it only depends on the ambient dimension. (Based on joint work with H. Biermé, R. Cotsakis, E. Di Bernardino and A. Estrade.)

10/01/2024 (11am CET)
Speaker: Charles Arnal (Inria-Saclay)
Title: Critical points of generic submanifolds

[Work by David Cohen-Steiner, Vincent Divol, and Charles Arnal]

Given a closed set S in Euclidean space, one can consider the (generalized) gradient of the distance function to S. We are interested in the study of this gradient, and especially in the (generalized) critical points Z(S) of the distance function, which play an important role in computational geometry, and in particular in the study of the Cech complex of S. In general, the set of critical points Z(S) can be poorly behaved; however, we show that Z(S) becomes very regular and stable when S=M is a generically embedded submanifold. More precisely, for any compact abstract manifold M, the set of embeddings of M into a Euclidean space such that their image satisfies our strong regularity and stability conditions is open and dense in the Whitney C2-topology. When those conditions are satisfied, the distance function to the embedded submanifold essentially behaves like a Morse function in terms of the homotopy type of its sublevels. In this talk, I will detail this result and explain the core ideas of the proof, including our application of Thom’s transversality theorem. In his talk on the 24th of January, my coauthor Vincent Divol will detail the consequences of these results in terms of diagrams of subsamples of a generic submanifold.

24/01/2024 (11am CET)
Speaker: Vincent Divol (Université PSL)
Title: Wasserstein convergence of persistence diagrams on generic manifolds

Persistence diagrams (PDs) allow one to describe the topology of a point cloud in a multiscale fashion. When the point cloud is sampled on an m-dimensional submanifold M of R^d, the bottleneck stability theorem ensures that the PD of the point cloud is close with respect to the bottleneck distance to the PD of the underlying manifold M. However, closeness with respect to this distance does not give any indication on the structure of the points in the PDs that are close to the diagonal. For a generic submanifold M, we are able to give a precise analysis on the shape of these points by providing laws of large numbers. This generalizes previous results by Wolfgang Polonik and me in the case of points sampled in the cube [0,1]^m. (Joint work with Charles Arnal and David Cohen-Steiner)

31/01/2024 (11am CET)
Speaker: Henrique Ennes (Inria Sophia Antipolis) and Raphaël Tinarrage (FGV EMAp)
Title: Detection of representation orbits of compact Lie groups on point clouds

Arguably, the most impactful realization in early last-century quantitative sciences was the combination of the loose notions of symmetry with the precise definition of algebraic groups. From important results in cryptography, passing through the description of the hydrogen atom and the standard model of particles, arriving at the formalization of Klein’s Erlangen program, or even visual psychology, the group-theoretic point of view on symmetries allowed for unprecedented developments in many theoretical and practical fields. In particular, in Machine Learning, there is a long history of interest in detecting actions of groups. For instance, identifying symmetries of the special Euclidean group SE(n) has been tackled since the 1980s for planar data, and more recently 3D objects. Other Lie groups have also been studied, Abelian or not. Once the information of acting Lie groups is determined, it can be used to increase the efficiency of other inference and machine learning tasks, through the use of equivariant algorithms. However, the methods of representation detection found in the literature do not address the question of estimating the exact representation type, that is, its decomposition into irreducible representations. Indeed, the representation is often found through its Lie algebra, which is approximated as if it were a linear subspace. Consequently, the information regarding the commutators is not exact, and the approximation may not be stable under the Lie bracket. In this work, we tackle the question of projecting onto the closest Lie algebra, as raised by Cahill, Mixon, and Parshall, through employing tools from optimization on matrix manifolds. Namely, the manifolds involved are the Stiefel and Grassmannian of Lie subalgebras. We propose a careful analysis of our algorithm, allowing us to obtain precise theoretical guarantees for the convergence of the optimization.

By bridging the gap between compact Lie group representations and symmetry detection in practice, our algorithm allows the reconstruction of the orbits, and the identification of the group that generates the action. Our algorithm is general for any compact Lie group, but we implement it and focus specifically on SO(2), T^d, SU(2), and SO(3). We illustrate its accuracy in the context of three data science problems, including image analysis, harmonic analysis, and classical mechanics systems.

08/02/2024 (11am CET)
Speaker: Eric Goubault (LIX)
Title: Directed homology and persistence modules

In this talk, we will try to give a self-contained account of a construction of a directed homology theory based on modules over algebras, linking it to both persistent homology and natural homology, the latter having originally been proposed as a convenient directed homology theory based on natural systems of modules.

Persistence modules were originally introduced for topological data analysis, where the data set seen at different “resolutions” is organized as a filtration of spaces. This has been further generalized to multidimensional persistence and “generalized” persistence, where a persistence module is defined to be any functor from a partially ordered set, or more generally a preordered set, to an arbitrary category (in general, a category of vector spaces).

Directed topology has its roots in a very different application area, concurrency and distributed systems theory rather than data analysis. Its goal is to study (multi-dimensional) dynamical systems, discrete (precubical sets, for applications to concurrency theory) or continuous time (differential inclusions, e.g. for applications in control), that appear in the study of multi-agent systems. In this framework, topological spaces are “directed”, meaning they have “preferred directions”, for instance a cone in the tangent space if we are considering a manifold, or the canonical local coordinate system in a precubical set. Natural homology, an invariant for directed topology, defines a natural system of modules, a further categorical generalization of (bi)modules, describing the evolution of the standard (simplicial or singular) homology of certain path spaces along their endpoints. Indeed, this is, in spirit, similar to persistent homology.

This talk will be concerned with a more “classical” construction of directed homology, mostly for precubical sets here, based on (bi)modules over (path) algebras, making it closer to classical homology with values in modules over rings, and to the techniques introduced for persistence modules. Still, this construction retains the essential information that natural homology unveils. Of particular interest will be the role of the restriction and extension of scalars functors, which will be central to the discussion of relative homology and Mayer-Vietoris sequences. If time permits, we will also discuss a Künneth formula and some “tameness” issues, for dealing with practical calculations.

13/03/2024 (10am CET)

Speaker: Alexandra Rören and Adrien Beaud (CRESS, Inserm)

Title: Exploring methods for shoulder movement quality assessment using IMUs, in individuals with Subacromial Pain Syndrome

20/03/2024 (1.30pm CET)

Speaker: Sara Kalisnik (ETH)

Title: Persistent Homology for Ellipsoid Complexes

(the first part is joint work with Davorin Lešnik, the second part is joint work with Bastian Rieck and Ana Zegarac)

A seminal result by Niyogi, Smale and Weinberger states that if a sample of a closed smooth submanifold of a Euclidean space is dense enough (relative to the reach of the manifold), there exists an interval of radii for which the union of closed balls around sample points deformation retracts to the manifold. A tangent space is a good local approximation of a manifold, so we can expect that an object elongated in the tangent direction will approximate the manifold better than a ball. I will briefly review a result we proved with Davorin Lešnik that the union of ellipsoids of suitable size around sample points deformation retracts to the manifold while requiring a much smaller density than in the case of a union of balls. Then I will focus on work in progress with Ana Zegarac and Bastian Rieck on implementing ellipsoid complexes, and in particular, I will share some of the preliminary results.

20/03/2024 (3pm CET)

Speaker: Miguel O’Malley (MPI MiS)

Title: Alpha magnitude and dimension

(joint work with Sara Kalisnik and Nina Otter)

Magnitude, an isometric invariant of metric spaces, is known to bear rich connections to other desirable invariants, such as dimension, volume, and curvature. Connections between magnitude and persistent homology, a method to observe topological features in datasets, are well studied and fruitful. We leverage one such connection, persistent magnitude, to introduce alpha magnitude, a new invariant which shares many of the properties of magnitude. We show in particular a strong connection to the Minkowski dimensions of compact subspaces of R^n and conjecture that the connection exists in general.
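
For readers unfamiliar with magnitude, here is a minimal sketch of the ordinary magnitude of a finite metric space, not the alpha magnitude introduced in the talk. It assumes the standard definition: with similarity matrix Z_ij = exp(-t d(x_i, x_j)), the magnitude at scale t is the sum of the entries of the inverse of Z. The function name `magnitude` and the scale parameter `t` are illustrative choices, and the points are assumed to be distinct points of Euclidean space so that Z is invertible.

```python
import numpy as np

def magnitude(points, t=1.0):
    """Magnitude of a finite set of distinct Euclidean points at scale t."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Similarity matrix Z_ij = exp(-t * d_ij).
    similarity = np.exp(-t * dist)
    # A weighting w solves Z w = 1; the magnitude is the sum of the weights.
    weights = np.linalg.solve(similarity, np.ones(len(points)))
    return float(weights.sum())

two_points = [[0.0], [1.0]]
small_scale = magnitude(two_points, t=0.01)   # the two points look like one
large_scale = magnitude(two_points, t=100.0)  # the two points look separate
```

For two points at distance d the formula reduces to 2 / (1 + exp(-t d)), so the value moves from about 1 at small scales to about 2 at large scales, which is the sense in which magnitude is often described as an effective number of points.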

2022-2023

15/06	Laurent Oudre
25/05	Christophe Pichon
04/05	Clément Maria
~~06/04~~ 08/06	Clément Levrard
~~23/03~~ 30/03	Jae-Hun Jung
16/03	Olympio Hacquard
09/03	Umut Şimşekli
23/02	Ximena Fernandez
~~16/02~~ 11/05	Michael Ghil, Denisse Sciamarella
26/01	Laure Ferraris
05/01	Claire Brécheteau
17/11	Wolfgang Polonik
~~10/11~~ 08/12	Alice Le Brigant

For the duration of the thematic trimester GESDA at IHP, the regular seminars were paused.

15/06 (11h00 CEST)
Speaker: Laurent Oudre
Title: Gait analysis with inertial measurement units: from signal processing to topological data analysis

This talk will look back on the SmartCheck project, which aims to quantify and study gait during medical consultations using inertial measurement units. We will give an overview of the different approaches used over the years to analyse these data and to compare two recordings, for the purposes of inter-individual comparison and longitudinal follow-up. The first methods, drawn from signal processing, aimed to locate events of interest and changes in behaviour using pattern recognition and change-point detection techniques. More recently, we have asked whether these steps are necessary, and whether a more global approach drawn from topological data analysis could yield a suitable distance between recordings.

25/05 (11h00 CEST)
Speaker: Christophe Pichon
Title: DisPerSE: automatic identification of persistent structures in 2 and 3D complexes

DisPerSE stands for Discrete Persistent Structures Extractor. Its main purpose is to identify persistent topological features such as peaks, voids, walls and in particular filamentary structures within sampled distributions in 2D and 3D. It was developed with observational and numerical cosmology in mind (for the study of the so-called cosmic web in the Universe). I will present a few interesting applications and related theoretical breakthroughs from astrophysics.

04/05 (11h00 CEST)
Speaker: Clément Maria
Title: Aspects of Algorithmic Quantum Topology

In this talk, I will introduce notions from quantum topology through the lenses of algorithms and computational complexity. Quantum topology uses tools from mathematical physics to design topological invariants of knots and 3-manifolds. Computationally, the algorithmic complexity of these invariants is connected to the complexity class #P of counting problems and to the quantum complexity class BQP. More recently, efficient and often optimal (under #ETH) algorithms have been designed using methods from parameterized complexity, where the parameters may be of a combinatorial or topological nature. I will try to give an overview of the state of the art and the future challenges of the field.

~~06/04~~ 08/06 (15h45 CEST)
Speaker: Clément Levrard
Title: Optimal reach estimation via metric estimation

In geometric inference, where the goal is to approximate the support of a probability distribution P from a sample, the reach of the support is a key parameter: it provides a local scale below which the support can be regarded as “flat”, and it therefore appears in the rates of support estimation as well as in practical algorithms for reconstructing this support.

In this talk I will present the different characterizations of the reach, as well as its links with other kinds of local scales used in computational geometry, some of which can be estimated and others not. I will conclude by presenting the results obtained with Eddie Aamari and Clément Berenfeld, namely that the metric characterization of the reach (a relatively recent result of Jean-Daniel Boissonnat, André Lieutier and Mathijs Wintraecken) provides an “optimal” route (in the minimax sense) to estimating the reach.

~~23/03~~ 30/03 (11h00 CEST)
Speaker: Jae-Hun Jung
Title: Topological data analysis of time-series data

Time-series data analysis is found in various applications that deal with sequential data over a given interval, e.g. of time. In this talk, we discuss time-series data analysis based on topological data analysis (TDA). The commonly used TDA method for time-series data analysis relies on embedding techniques such as sliding window embedding. With sliding window embedding, the given data points are mapped to a point cloud in the embedding space, and persistent homology is applied to the resulting point cloud. In this talk, we first show some examples of time-series data analysis with TDA. The first example is from music data, for which the dynamic processes in time are summarized by a low-dimensional representation based on persistent homology. The second example is the gravitational wave detection problem, and we will discuss how we concatenate the real signal and topological features. Then we will introduce our recent work on exact and fast multi-parameter persistent homology (EMPH) theory. The EMPH method is based on the Fourier transform of the data and the exact persistent barcodes. EMPH is highly advantageous for time-series data analysis in that its computational complexity is as low as O(N log N) and it provides various topological inferences in almost no time. The presented works are in collaboration with Mai Lan Tran, Chris Bresten and Keunsu Kim.
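
The sliding window embedding mentioned in this abstract can be sketched as follows. This is only an illustration under assumed conventions: the function name `sliding_window_embedding` and the parameters `dim` (window size) and `delay` (step between coordinates) are hypothetical, and the persistent homology step, which would require a TDA library, is not shown.

```python
import numpy as np

def sliding_window_embedding(signal, dim, delay=1):
    """Map a 1D signal to the point cloud of its delayed windows.

    Row i of the result is (x[i], x[i + delay], ..., x[i + (dim - 1) * delay]).
    """
    signal = np.asarray(signal, dtype=float)
    n_points = len(signal) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("signal too short for the requested window")
    # Index matrix: one row per window, one column per coordinate.
    idx = np.arange(n_points)[:, None] + delay * np.arange(dim)[None, :]
    return signal[idx]

# A sampled sine wave becomes a closed curve in the plane.
t = np.linspace(0.0, 4.0 * np.pi, 200)
cloud = sliding_window_embedding(np.sin(t), dim=2, delay=25)
```

A periodic signal typically traces out a loop in the embedding space, which is the kind of feature that one-dimensional persistent homology of the point cloud is meant to detect.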

16/03 (11h00 CET)
Speaker: Olympio Hacquard
Title: Statistical learning on measures: an application to persistence diagrams

We consider a binary supervised learning classification problem where instead of having data in a finite-dimensional Euclidean space, we observe measures on a compact space $\mathcal{X}$. Formally, we observe data $D_N = (\mu_1, Y_1), \ldots, (\mu_N, Y_N)$ where $\mu_i$ is a measure on $\mathcal{X}$ and $Y_i$ is a label in $\{0, 1\}$. Given a set $\mathcal{F}$ of functions on $\mathcal{X}$, we build corresponding classifiers in the space of measures, and we provide upper and lower bounds on the Rademacher complexity of this new class of classifiers that can be expressed simply in terms of corresponding quantities for the class $\mathcal{F}$. If the measures $\mu_i$ are uniform over a finite set, this classification task boils down to a multi-instance learning problem. However, our approach allows more flexibility and diversity in the input data we can deal with. While such a framework has many possible applications, this work places strong emphasis on classifying data via topological descriptors called persistence diagrams. These objects are discrete measures on $\mathbb{R}^2$, where the coordinates of each point correspond to the range of scales at which a topological feature exists. We will present several classifiers on measures and show how they can heuristically and theoretically enable a good classification performance in various settings in the case of persistence diagrams.

09/03 (15h45 CET)
Speaker: Umut Şimşekli
Title: Generalization in SGD: Heavy-Tails and Fractal Structure

In this talk, we will first derive a generalization bound for SGD under the assumption that it can be well-modeled by a heavy-tailed stochastic differential equation. To do so, we will use tools from fractal geometry and geometric measure theory and make a connection between heavy tails and generalization error through fractal geometry. Then, we will extend this framework by using the so-called “persistent homology dimension”, which is a notion of fractal dimension that is defined through topological data analysis (TDA) notions. Thanks to this connection, we will show that generalization in neural networks can be efficiently computed by using TDA tools. This talk is based on the following papers: arXiv:2006.09313 and arXiv:2111.13171.

23/02 (11h00 CET)
Speaker: Ximena Fernandez
Title: Intrinsic persistent homology via density-based metric learning
Slides

In this talk, I will present a recent density-based method to address the problem of estimating topological features from data in high-dimensional Euclidean spaces under the manifold assumption. The key to this approach is to consider a sample metric known as the Fermat distance to robustly infer the homology of the space of data points. We prove that this sample metric space GH-converges almost surely to the manifold itself endowed with an intrinsic (Riemannian) metric that accounts for both the geometry of the manifold and the density that produces the sample. This fact, together with the stability properties of persistent homology, implies the convergence of the associated persistence diagrams, which have advantageous properties. We show that they are robust to the presence of (geometric) outliers in the input data and less sensitive to the particular embedding of the underlying manifold in the ambient space. Finally, we exhibit concrete applications of these ideas to time series analysis, with examples in real data.

This is joint work with E. Borghini, G. Mindlin and P. Groisman, Intrinsic persistent homology via density-based metric learning, arXiv preprint arXiv:2012.07621, 2020.
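
As a rough illustration of the sample Fermat distance discussed in this abstract, the sketch below assumes the following definition: the distance between two sample points is the minimum, over paths hopping through sample points, of the sum of the Euclidean step lengths raised to a power p ≥ 1. The function name `sample_fermat_distance` and the parameter name `p` are illustrative, and the all-pairs computation uses a plain Floyd-Warshall pass rather than whatever scheme the authors use.

```python
import numpy as np

def sample_fermat_distance(points, p=2.0):
    """All-pairs sample Fermat distances for a finite point cloud."""
    points = np.asarray(points, dtype=float)
    # Cost of a direct hop: Euclidean distance raised to the power p.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1)) ** p
    # Floyd-Warshall: allow each point in turn as an intermediate stop.
    for k in range(len(points)):
        dist = np.minimum(dist, dist[:, k, None] + dist[None, k, :])
    return dist

# Three collinear points: with p = 2 the direct hop from 0 to 2 costs 4,
# while stopping at the middle point costs 1 + 1 = 2.
line_points = [[0.0], [1.0], [2.0]]
fermat = sample_fermat_distance(line_points, p=2.0)
```

With p > 1 long hops are penalized, so shortest paths prefer many short steps through densely sampled regions, which is how the density of the sample enters the metric. With p = 1 the Euclidean distance between sample points is recovered.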

~~16/02~~ 11/05 (11h00 CET)
Speaker: Michael Ghil, Denisse Sciamarella
Title: Chaos, stochasticity and algebraic topology: new methods and their application to climate

Abstract here.

26/01 (11h00 CET)
Speaker: Laure Ferraris
Title: Imitating light for topological data analysis

This talk will be about “metric learning”.
First, I will present the family of so-called Fermat distances. These distances define a shortest path between two points which, like light, travels through different media at different speeds. Interpreting a subset of R^D as the support of a probability measure, in this model the path does not leave the support and is accelerated when it crosses regions of high density. The distance thus adapts both to the geometry and to the distribution of mass. This effect can be tuned through a parameter whose choice, which is delicate, may depend on the application context but also on statistical constraints or on computational complexity.
I will introduce the family of estimators of the Fermat distance and the first results on their statistical properties. Finally, I will mention the open questions driving ongoing research in collaboration with Matthieu Jonckheere, Facundo Sapienza, Pablo Groisman, Frédéric Pascal and Frédéric Chazal.
Second, by way of opening, I will discuss my research project under the supervision of Frédéric Chazal: “The introduction of a new distance between two points, the DTM-Fermat distance, and its applications in topological data analysis”.

05/01 (15h45 CET)
Speaker: Claire Brécheteau (Université de Nantes)
Title: Approximating data by a union of balls or ellipsoids, and clustering

This talk will be about building a proxy for the distance function to a compact set, from a point cloud sampled on this compact set with noise.
This proxy will be built from a k-means-type criterion, with a Bregman divergence. Its sublevel sets will be unions of balls. I will present the use of this proxy for data clustering.
This is work published in:
– Claire Brécheteau and Clément Levrard, A k-points-based distance for robust geometric inference. Bernoulli 2020, Vol. 26, No. 4, 3017-3050
– Claire Brécheteau and Aurélie Fischer and Clément Levrard, Robust Bregman Clustering. Annals of Statistics 2021, Vol. 49, No. 3, 1679-1701
– Claire Brécheteau, Robust anisotropic power-functions-based filtrations for clustering. Symposium on Computational Geometry 2020, 23:1-23:15

17/11 (15h45 CET)
Speaker: Wolfgang Polonik (UC Davis)
Title: Multiscale Geometric Feature Extraction

A method for extracting multiscale geometric features from a data cloud is presented. Each pair of data points is mapped into a real-valued feature function, whose construction is based on geometric considerations and the novel notion of a distribution of (local) data depths. The collection of these feature functions is then used for further data analysis. In contrast to the popular kernel trick, our feature functions are functions of a one-dimensional parameter, and thus they can be used to visualize geometric aspects of a high-dimensional data cloud. Besides visualization, applications include classification and anomaly detection. The performance of the methodology is illustrated through applications to real data sets, and some theoretical guarantees supporting the performance of the proposed methodology are presented. This is joint work with G. Chandler.

~~10/11~~ 08/12 (15h45 CET)
Speaker: Alice Le Brigant
Title: Fisher information geometry of Dirichlet distributions

The Fisher information can be used to define a Riemannian metric to compare probability distributions inside a parametric family. The most well-known example is the case of (univariate) normal distributions, where the Fisher information induces hyperbolic geometry. In this talk we will investigate the Fisher information geometry of Dirichlet distributions, and beta distributions as a particular case. We show that it is negatively curved and geodesically complete. This guarantees the uniqueness of the notion of mean distribution, and makes it a suitable geometry to apply the K-means algorithm, e.g. to compare and classify histograms.

2021-2022

30/06	Édouard Bonnet
23/06	Théo Lacombe
16/06	Quentin Mérigot & Mathijs Wintraecken & André Lieutier
02/06	Dmitriy Morozov
19/05	Thomas Bonis
28/04	Jean Feydy & Anna Song
21/04	Anthea Monod
07/04	Marc Hoffman
10/03	Jeremy Capitao
24/02	Vadim Lebovici
17/02	Nicolas Berkouk & Raphael Reinhauer
03/02	Mathieu Carriere
13/01	Olympio Hacquard
06/01	Catherine Aaron
09/12	Vadim Lebovici
25/11	Eduardo Mendes
18/11	Jonathan Spreer
14/10	Marine le Morvan
07/10	Etienne Lasalle
23/09	Daniel Perez

30/06 (11h00 CEST)
Speaker: Édouard Bonnet
Title: The geometry of twin-width

We introduce and survey the key features of twin-width, and more generally of the so-called contraction sequences, insisting on what is and what is not geometric or topological about them.
