# Datashape Seminar

## Talks:

• Claire Brecheteau, Inria Saclay

“The k-PDTM : a coreset for robust geometric inference”

Analysing the sub-level sets of the distance to a compact sub-manifold of R^d is a common method in Topological Data Analysis to understand its topology.The distance to measure (DTM) was introduced by Chazal, Cohen-Steiner and Mérigot in 2011 to face the non-robustness of the distance to a compact set to noise and outliers. This function makes possible the inference of the topology of a compact subset of R^d from a noisy cloud of n points lying nearby in the Wasserstein sense. In practice, these sub-level sets may be computed using approximations of the DTM such as the q-witnessed distance by Guibas, Morozov and Mérigot (2013) or other power distance by Buchet, Chazal, Oudot and Sheehy (2016). These approaches lead eventually to compute the homology of unions of n growing balls, that might become intractable whenever n is large.

To simultaneously face the two problems of large number of points and noise, we introduce the k-power distance to measure (k-PDTM). This new approximationof the distance to measure may be thought of as a k-coreset based approximation of the DTM. Its sub-level sets consist in union of k-balls, k <<n, and this distance is also proved robust to noise. We assess the quality of this approximation for k possibly dramatically smaller than $n$, for instancek=n^{1/3} is proved to be optimal for 2-dimensional shapes. We also provide an algorithm to compute this k-PDTM.

• David Cohen-Steiner, Inria Sophia

“Approximating the Spectrum of Large Matrices joint” with Weihao Kong, Christian Sohler and Gregory Valiant

• Guilherme Dias da Fonseca, Université Clermont-Auvergne

“Approximate Polytope Membership Queries and Applications”

In the polytope membership problem, we are given a convex polytope P⊂Rd for constant d ≥ 2, and the objective is to preprocess P into a data structure so that, given any query point q ∈ Rd, it is possible to determine efficiently whether q P. We consider this problem in an approximate setting. Given an approximation parameter ε > 0, an ε– approximate polytope membership query returns a positive result if q P, a negative result if the distance from q to P is greater than ε·diam(P), and it may return either result otherwise.

We presented the first data structures especially tailored to approximate polytope membership. Initially, we showed how to answer queries in O(1/ε(d−1)/4) time using optimal O(1/ε(d−1)/2) storage. Later, we improved the analysis of the same data structure to O(log(1/ε)/ε(d−1)/8) query time for the same optimal storage. Switching to a different approach, we finally obtained an optimal data structure with O(log(1/ε)) query time and O(1/ε(d−1)/2) storage. Our data structures yield dramatic improvements to several well-studied geometric approximation problems, where the input is a set of n points and α >0 is an arbitrarily small constant:

• We reduce the storage needed to answer approximate nearest neighbor queries in polylogarithmic time from O(n/εd−1) to O(n/εd/2).
• We reduce the time to approximate the diameter from O((n+ 1/εd−2) log(1/ε)) to O(nlog(1/ε) + 1/ε(d−1)/2+α).
• We reduce the time to construct an ε-kernel from O((n+ 1/εd−2) log(1/ε)) to near-optimal O(nlog(1/ε) + 1/ε(d−1)/2+α).
• We reduce the expected time to approximate the bichromatic closest pair from O(n/εd/3) to O(n/εd/4+α).
• We reduce the expected time to approximate the minimum bottleneck Euclidean spanning tree from O((nlogn)/εd/3) to O((nlogn)/εd/4+α).

While our initial approach to approximate polytope membership was based on standard techniques of grids and quadtrees, our most efficient data structure presents the first algorithmic application of Macbeath regions, a classical structure from convex geometry. The data structure consists of a hierarchy of John ellipsoids of Macbeath regions, where each level of the hierarchy provides a certain degree of approximation of the boundary of the polytope.

• Vincent Divot, Inria Saclay

“The density of the expected persistence diagrams and its kernel-based estimation”

Persistence diagrams play a fundamental role in Topological Data Analysis where they are used as topological descriptors of filtrations built on top of data. They consist in discrete multisets of points in the plane R^2 that can equivalently be seen as discrete measures in R^2. When the data come as a random point cloud, these discrete measures become random measures whose expectation is studied in this paper. First, we show that for a wide class of filtrations, including the Čech and Rips-Vietoris filtrations, the expected persistence diagram, that is a deterministic measure on R^2 , has a density with respect to the Lebesgue measure. Second, building on the previous result we show that the persistence surface recently introduced in [Adams & al., Persistence images: a stable vector representation of persistent homology] can be seen as a kernel estimator of this density. We propose a cross-validation scheme for selecting an optimal bandwidth, which is proven to be a consistent procedure to estimate the density.

• Kunal Dutta, Inria Sophia

“On the Randomized Incremental Construction of Delaunay Triangulations of Epsilon-nets

The randomized incremental construction (RIC) paradigm for building geometric data structures introduced by Clarkson and Shor [Disc. & Comp. Geom., 1989] has found wide application in the field of computational geometry and related areas such as graphics, statistical inference and machine learning. It has been analyzed extensively from the point of view of worst-case distributions. In many practical situations however, we face nicer distributions. A natural question that arises is: do the usual RIC algorithms automatically adapt when the point samples are nicely distributed? We answer positively to this question for the case of the Delaunay triangulation of $\varepsilon$-nets. Under some additional assumptions, our proof works in geodesic metric spaces. A key component of our proof is a bound on the probabilities of certain non-monotone events under sampling without replacement, which may be of general interest.

Joint work with J.-D. Boissonnat, O. Devillers and M. Glisse.

• Arijit Ghosh, Indian Statistical Institute

“Parameterized Query Complexity of Hitting Set using Stability of Sunflowers”

In this paper, we study the query complexity of parameterized versions of Max Cut, Vertex Cover, and their generalizations.  We study the query complexity of both the decision version and optimization version of the above problems. In doing so, we use earlier query models and also design new ones. The query models considered are the BIS and BISE oracles:

(i) the BIS oracle takes as input two vertex subsets A, B in a graph G and answers whether there is an edge with one end point in A and the other endpoint in B,
(ii) the BISE oracle takes the same input and as an answer returns an edge that has one end point in A and another in B.
The BIS and BISE oracles are used for the decision and optimization versions of the problems, respectively. To study the generalizations of the above problems into hypergraphs, we introduce the GPIS and GPISE oracle models.

This a joint work with Arijit Bishnu (ISI, Kolkata), Sudeshna Kolay (TU, Eindhoven), Gopinath Mishra (ISI, Kolkata) and Saket Saurabh (IMSC, Chennai).

• Siargey Kachanovich, Inria Sophia

“Manifold reconstruction based on Coxeter triangulations of the ambient Euclidean space”

We present an experimental algorithm for manifold reconstruction, which takes as input a point cloud sampled from a manifold with positive reach and outputs a simplicial complex which is homotopy equivalent to the manifold. The time complexity and the space complexity are polynomial in ambient dimension and depend on the intrinsic dimension $k$ of the manifold as $O(2^k)$, which improves the $O(2^{k^2})$ time and space complexities of tangential complex. The algorithm is based on Coxeter triangulations, space-filling triangulations generated by a single simplex via consecutive flips. We have seen in experiments that a series of collapses can be applied to the output of the first algorithm to find a triangulation of the manifold.

• Miroslav Kramar, Inria Saclay

A comparison framework for interleaved persistence modules

I will  present a generalization of the induced matching theorem  and use it to prove a generalization of the algebraic stability theorem for $\R$-indexed pointwise finite-dimensional persistence modules.
Via examples I will show how the generalized algebraic stability theorem enables the computation of rigorous error bounds in the space of persistence diagrams that go beyond the typical formulation in terms of bottleneck (or log bottleneck) distance.

• Siddharth Pritam, Inria Sophia

Strong Collapse for Persistence

We introduce a fast and memory efficient approach to compute the persistent homology (PH) of a sequence of simplicial complexes. The basic idea is to simplify the complexes of the input sequence by using strong collapses, as introduced by J. Barmak and E. Miniam, and to compute the PH of an induced sequence of reduced simplicial complexes that has the same PH as the initial one. Our approach has several salient features that distinguishes it from previous work. It is not limited to filtrations (i.e. sequences of nested simplicial subcomplexes) but works for other types of sequences like towers and zigzags. To strong collapse a simplicial complex, we only need to store the maximal simplices of the complex, not the full set of all its simplices, which saves a lot of space and time. Moreover, the complexes in the sequence can be strong collapsed independently and in parallel. As a result and as demonstrated by numerous experiments on publicly available data sets, our approach is extremely fast and memory efficient in practice.

• Hannah Schreiber, TU Graz

Morse Complexes for Zigzag Persistent Homology

Discrete Morse Theory is used in standard persistent homology to simplify the initial filtration and therefore to accelerate the persistence computing time. We will adapt the idea to the more general case of zigzag filtrations.

• Mathijs Wintraecken, Inria Sophia

Triangulating stratified manifolds: a reach comparison theorem

We define the reach for submanifolds of Riemannian manifolds, in a way that is similar to the Euclidean case. Given a $d$-dimensional submanifold $\mathcal{S}$ of a smooth Riemannian manifold $\mathcal{M}$ and a point $p \in \mathcal{M}$ that is not too far from $\mathcal{S}$ we want to give bounds on local feature size of $\exp_p^{-1}(\mathcal{S} )$. Here $\exp_p^{-1}$ is the inverse exponential map, a canonical map from the manifold to the tangent space. Bounds on the local feature size of $\exp_p^{-1}( \mathcal{S} )$ can be reduced to giving bounds on the reach of $\exp_p^{-1} (B(c,r))$, where $B(c,r)$ is a geodesic ball, or equivalently on $\exp_p^{-1}\circ \exp_c (B (c,r))$, where now $B(c,r)$ is a ball in the tangent space. To establish bounds on the reach of $\exp_p^{-1}\circ \exp_c (B (c,r))$, we use bounds on the metric and on its derivative in Riemann normal coordinates. This result is a first step towards answering the important question of how to triangulate stratified manifolds.

This is joint work with Jean-Daniel Boissonnat.

## Practical informations:

##### Place:

IGESA Center of  Hyères (Var)

##### Accommodation:

The number of single room is limited.

Accommodation in double room for Phd Students and interns.

Please, send an email to Stéphanie (stephanie.aubin@inria.fr) and Florence (florence.barbara@inria.fr)  indicating with whom you wish to share your room. Without response, the distribution will be done randomly.

##### Mission:

Apply for your mission as soon as possible and charge it to:

. Agents from Inria Sophia: DB 04RECH6016-S – GUDHI – Allocation 8349

. Agents from Inria Saclay: DB 11RECH1076-S – FUJITSU – Allocation 12749

##### Transportation:
###### From Saclay:

Plane:

. Tuesday May 22, Paris Orly/Toulon: 9:10 –>  10:35 (recommanded flight)

. Thursday May 24, Toulon/Paris Orly: 16:20 –> 17:50 ((recommanded flight)

From Toulon airport to IGESA (7 km):

Taxi: Taxis Hyèrois : 04 94 00 60 00

Bus: Toulon Airport/Hyères Centre Joffre – Lines : 63 – it takes about 20 mn – Stop at around 15 mn walking distance from IGESA.

###### From Sophia:

By car (130 Km): A8 exit  Hyères/Toulon/Le Luc –  around 1:30 mn

You can park your vehicule at IGESA.

. Return from IGESA: Thursday May 24, 15:00 (arrival around 16:30)