Seminars

Links' Seminars and Public Events
2021
Fri 19th Mar
10:00 am
12:00 pm
Seminar Pablo Ferragin
Title: Theory and practice of learning-based compressed data structures

Presenter: Giorgio Vinciguerra

Abstract:
We revisit two fundamental and ubiquitous problems in data structure design:
predecessor search and rank/select primitives. We show that real data present a
peculiar kind of regularity based on geometric considerations. We name it
“approximate linearity”.
We thus expand the horizon of compressed data structures by presenting two
solutions for the problems above that discover, or “learn”, in a principled
algorithmic way, these approximate linearities. We provide a walkthrough of
these new theoretical achievements, also with a focus on open-source libraries
and their experimental improvements. We conclude by discussing the plethora of
research opportunities that these new learning-based approaches to data
structure design open up.
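
As a rough illustration of how an approximate linearity can be exploited for predecessor search, here is a minimal sketch. It is not the presenter's data structure: it fits a single least-squares line to the whole key set (learning-based indexes typically use several segments), and the names `fit_line` and `LearnedIndex` are hypothetical. The line predicts a position, the maximum prediction error `eps` is recorded at build time, and a query only binary-searches a window of about 2·eps positions around the prediction.

```python
from bisect import bisect_right


def fit_line(keys):
    """Least-squares line mapping a key to its position in the sorted array."""
    n = len(keys)
    mean_x = sum(keys) / n
    mean_y = (n - 1) / 2
    var = sum((x - mean_x) ** 2 for x in keys)
    if var == 0:
        return 0.0, mean_y
    cov = sum((x - mean_x) * (i - mean_y) for i, x in enumerate(keys))
    slope = cov / var  # non-negative because keys and positions both increase
    return slope, mean_y - slope * mean_x


class LearnedIndex:
    """Toy single-segment index (illustrative assumption, not a library API)."""

    def __init__(self, keys):
        self.keys = sorted(set(keys))
        if not self.keys:
            raise ValueError("at least one key is required")
        self.slope, self.intercept = fit_line(self.keys)
        # Largest distance between predicted and true position of a stored key.
        self.eps = max(abs(self._predict(x) - i) for i, x in enumerate(self.keys))

    def _predict(self, x):
        return round(self.slope * x + self.intercept)

    def predecessor(self, x):
        """Return the largest stored key <= x, or None if there is none."""
        n = len(self.keys)
        p = self._predict(x)
        # The prediction is monotone in x, so the answer lies in this window.
        lo = min(n, max(0, p - self.eps))
        hi = min(n, max(0, p + self.eps + 1))
        r = bisect_right(self.keys, x, lo, hi)
        return self.keys[r - 1] if r > 0 else None


index = LearnedIndex([2, 3, 5, 7, 11, 13, 17, 19, 23, 29])
print(index.predecessor(12))  # 11
```

The closer the data is to a line, the smaller `eps` becomes and the less work the final binary search has to do.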

Zoom link: univ-lille-fr.zoom.us/j/95419000064
Fri 12th Mar
10:00 am
12:00 pm
Seminar: Antonio AL SERHALI
Title: Can Earliest Query Answering on Nested Streams be achieved in Combined Linear Time?
Fri 19th Feb
10:00 am
11:00 am
Seminar: Bernardo Subercaseau
Title: Foundations of Languages for Interpretability.

Abstract:
The area of interpretability in Machine Learning aims for the design of algorithms that we humans can understand and trust. One of the fundamental questions of interpretability is: given a classifier M, and an input vector x, why did M classify x as M(x)? In order to approximate an answer to this "why" question, many concrete queries, metrics and scores have emerged as proxies, and their complexity has been studied over different classes of models. Many of these analyses are ad-hoc, but they tend to agree on the fact that these queries and scores are hard to compute over Neural Networks, but easy to compute over Decision Trees. It is thus natural to think of a more general approach, like a query language in which users could write an arbitrary number of different queries, and that would allow for a generalized study of the complexity of interpreting different ML models. Our work proposes foundations for such a language, tying it to First-Order Logic, as a way to have a clear understanding of its expressiveness and complexity. We manage to define a minimalistic structure over FO that allows expressing many natural interpretability queries over models, and we show that evaluating such queries can be done efficiently for Decision Trees, in data-complexity.
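
To make the contrast with Decision Trees concrete, here is a small sketch of one query of the kind the abstract alludes to: does fixing some of the features already force the classification, whatever the remaining features are? The choice of query, the tuple encoding of trees and the function names are assumptions made for illustration; they are not the language proposed in the talk.

```python
# A tree is ("leaf", label) or ("node", feature, low_subtree, high_subtree),
# where low_subtree is followed when the Boolean feature is 0 and high_subtree
# when it is 1. This encoding is an assumption of the sketch.


def reachable_labels(tree, partial):
    """Labels of all leaves reachable by some completion of `partial`."""
    if tree[0] == "leaf":
        return {tree[1]}
    _, feature, low, high = tree
    if feature in partial:
        return reachable_labels(high if partial[feature] else low, partial)
    # Feature is free: explore both branches, remembering the choice so that
    # a repeated test of the same feature further down stays consistent.
    return (reachable_labels(low, {**partial, feature: 0})
            | reachable_labels(high, {**partial, feature: 1}))


def forces_classification(tree, partial):
    """True if every completion of `partial` receives the same label."""
    return len(reachable_labels(tree, partial)) == 1


# Tree computing x0 AND x1.
and_tree = ("node", 0, ("leaf", 0), ("node", 1, ("leaf", 0), ("leaf", 1)))

print(forces_classification(and_tree, {0: 0}))        # True: always class 0
print(forces_classification(and_tree, {0: 1}))        # False: depends on x1
print(forces_classification(and_tree, {0: 1, 1: 1}))  # True: always class 1
```

The check is a plain traversal of the tree, which is the kind of efficiency that is not available in general for Neural Networks.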

Zoom link: univ-lille-fr.zoom.us/j/95419000064
Fri 12th Feb
10:00 am
12:00 pm
Seminar: Florent Capelli
Title: Regularizing the delay of enumeration algorithms
Zoom link: univ-lille-fr.zoom.us/j/95419000064
Abstract: Enumeration algorithms are algorithms whose goal is to output the set
of all solutions to a given problem. There exist different measures for the
quality of such algorithms, whose relevance depends on what the user wants to do
with the solution set.

If the goal of the user is to explore some solutions or to transform the
solutions as they are outputted with a stream-like algorithm, a relevant measure
of the complexity of an enumeration algorithm is the delay between the output of
two distinct solutions. Following this line of thought, significant efforts
have been made by the community to design polynomial delay algorithms, that is,
algorithms whose delay between the output of two new solutions is polynomial in
the size of the input.

While this measure is interesting, it is not always necessary to have
a bound on the delay: it is often enough to ask for a guarantee that running the
algorithm for time O(t poly(n)) will result in the output of at least t solutions. Of
course, by storing each solution seen and outputting them regularly, one can
simulate a polynomial delay, but if the number of solutions is large, this may
result in a blow-up in the space used by the enumerator.

In this talk, we will present a new technique that transforms such
algorithms into polynomial-delay algorithms using polynomial space.

This is joint work with Yann Strozecki.
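
The buffering strategy mentioned in the abstract (store every solution seen and release them at regular intervals) can be sketched as follows. This is the naive simulation whose space usage the talk sets out to avoid, not the new technique itself. The event encoding (`None` for a computation step without output) and the names `regularize` and `budget` are assumptions of the sketch.

```python
from collections import deque


def regularize(events, budget):
    """Release one buffered solution every `budget` steps.

    `events` is an iterable in which None stands for one computation step
    without output and any other value is a solution found at that step.
    Returns the list of (step, solution) outputs and the peak buffer size.
    """
    queue = deque()
    outputs = []
    peak = 0
    step = 0
    for step, event in enumerate(events, start=1):
        if event is not None:
            queue.append(event)
            peak = max(peak, len(queue))
        if step % budget == 0 and queue:
            outputs.append((step, queue.popleft()))
    # The source is exhausted: flush whatever is still buffered.
    while queue:
        outputs.append((step, queue.popleft()))
    return outputs, peak


# Eight solutions found in the first eight steps, then 24 idle steps.
# After 4*t steps at least t solutions have been found, so budget = 4 works.
events = list(range(8)) + [None] * 24
outputs, peak = regularize(events, budget=4)
print([s for s, _ in outputs])  # [4, 8, 12, 16, 20, 24, 28, 32]
print(peak)                     # 7
```

The outputs are evenly spaced, but the buffer holds almost every solution at its peak; when solutions arrive in large bursts, that buffer is where the space blow-up comes from.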
Fri 15th Jan
10:00 am
12:00 pm
Seminar: Kim Nguyễn
Title: The BOLDR project
Abstract: In this presentation, I will give an account of the BOLDR project and
perspectives in the field of language integrated queries.

Several classes of solutions allow programming languages to express
queries: specific APIs such as JDBC, Object-Relational Mappings (ORMs)
such as Hibernate, and language-integrated query frameworks such as
Microsoft's LINQ. However, most of these solutions do not allow for
efficient cross-database queries, and none allow the use of complex
application logic from the programming language in queries.

We study the design of a new language-integrated query
framework called BOLDR that allows the evaluation in databases of
queries written in general-purpose programming languages containing
application logic, and targeting several databases following different
data models. In this framework, application queries are translated to
an intermediate representation. Then, they are typed with a type
system extensible by databases in order to detect which database
language each subexpression should be translated to. This type system
also allows us to detect a class of errors before execution. Next,
they are rewritten in order to avoid query avalanches and make the
most out of database optimizations. Finally, queries are sent for
evaluation to the corresponding databases and the results are
converted back to the application. Our experiments show that the
techniques we implemented are applicable to real-world database
applications, successfully handling a variety of language-integrated
queries with good performance.

This talk will give an overview of what has been achieved so far (mainly
in the context of Julien Lopez's PhD thesis) and will give a glimpse of preliminary
work that is being done in the context of a collaboration with Oracle Labs.
Fri 8th Jan
10:45 am
12:30 pm
Seminar: Lê Thành Dũng (Tito) Nguyễn
Title: The planar geometry of first-order string transductions (joint work with Pierre Pradic)


Abstract:
hal.archives-ouvertes......ument

We propose a new machine model recognizing star-free languages, with a geometric flavor. Our starting point is the characterization of regular languages using two-way automata (2DFA). The idea is to take seriously the visual representations found throughout the literature of the behavior of a 2DFA on a word; by putting a total order on the set of states, one can formally define what it means for such a behavior to be planar, in a sense analogous to the planarity of combinatorial maps. Star-free languages are then exactly the languages recognized by "planar 2DFA". We also show that the corresponding planar transducer model characterizes the class of first-order transductions (a.k.a. aperiodic regular functions). If time allows, the talk will briefly discuss the connections of this work with the non-commutative lambda-calculus (cf. our recent paper Aperiodicity in a non-commutative logic, ICALP'20).


2020
Thu 17th Dec
2:00 pm
4:00 pm
Nofar Carmeli
Speaker: Nofar Carmeli (nofar.carme.li/)

Zoom link: univ-lille-fr.zoom.us/j/95419000064

Title: The Complexity of Answering Unions of Conjunctive Queries.

Abstract:
We discuss the fine-grained complexity of enumerating the answers to a query over a relational database. With the ideal guarantees, linear time is required before the first answer to read the input and determine its existence, and then we need to print the answers one by one. Consequently, we wish to identify the queries that can be solved with linear preprocessing time and constant or logarithmic delay between answers. A known dichotomy classifies CQs into those that admit such enumeration and those that do not. The computationally expensive component of query answering is joining tables, which can be done efficiently if and only if the join query is acyclic. However, the join query usually does not appear in a vacuum; for example, it may be part of a larger query, or it may be applied to a database with dependencies. We inspect how the complexity changes in these settings and chart the borders of tractability within. In addition, we consider the task of enumerating query answers with a uniformly random order, and we propose to do so using an efficient random-access structure for representing the set of answers. We also prove conditional lower bounds showing that our algorithms capture all tractable queries in some cases. Among our results, we show that a union of tractable conjunctive queries may be intractable w.r.t. random access; on the other hand, a union of intractable conjunctive queries may be tractable w.r.t. enumeration.
Fri 11th Dec
10:00 am
11:30 am
Alexandre Vigny
Title: Elimination Distance to Bounded Degree on Planar Graphs
Link to the zoominar: univ-lille-fr.zoom.us/j/95419000064
Abstract:
What does it mean for a graph to almost be planar? Or to almost have bounded
degree?
On such simple graph classes, some difficult algorithmic problems become
tractable.
Ideally, one would like to use (or adapt) existing algorithms for graphs that
are "almost" in such a simple class.

In this talk, I will discuss the notion of elimination distance to a class C, a
notion introduced by Bulian and Dawar (2016).
The goals of the talk are:
1) Define this notion, and discuss why it is relevant by presenting some
existing results.
2) Show that we can compute the elimination distance of a given planar graph to
the class of graphs of degree at most d.
I.e. answer the question: "Is this graph close to a graph of bounded degree?"

The second part is the result of a collaboration with Alexandre Lindermayer and
Sebastian Siebertz.

Fri 4th Dec
10:00 am
11:00 am
Seminar: Pierre Pradic
Title: Extracting nested relational queries from implicit definitions

Abstract:
arxiv.org/pdf/2005.06503.pdf

In this talk, I will present results obtained jointly with Michael
Benedikt establishing a connection between the Nested Relational
Calculus (NRC) and sets implicitly definable using Δ₀ formulas.

Call a formula φ(I,O) an implicit definition of the relation O(x,...) in
terms of I(y,...) if O is functionally determined by I: for every I, O,
O', if both φ(I,O) and φ(I,O') hold, then we have O ≡ O'. When φ is
first-order and I and O are relations over base sorts, then Beth's
definability theorem states that there is a first-order formula
ψ(I,x,...) corresponding to O whenever φ(I,O) holds. Further, this
explicit definition ψ can be effectively computed from a sequent
calculus proof witnessing that φ is functional. This allows for
practical use of implicit definitions in the context of database
programming, as there is a well-established link between fragments of
explicitly FO definable relations and relational calculi.
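
Written out, the two notions in the paragraph above read as follows. This uses only the symbols already introduced (φ, ψ, I, O, O'); the vector notation for the tuple of arguments is an assumption of this sketch.

```latex
% phi is an implicit definition of O in terms of I (functionality):
\[
  \forall I\;\forall O\;\forall O'\;
  \bigl(\varphi(I,O) \wedge \varphi(I,O')\bigr) \rightarrow O \equiv O'
\]
% psi is a corresponding explicit definition, as given by Beth's theorem:
\[
  \varphi(I,O) \rightarrow
  \forall \vec{x}\;\bigl(O(\vec{x}) \leftrightarrow \psi(I,\vec{x})\bigr)
\]
```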

NRC is a conservative extension of relational calculi from database
theory with limited powerset types in addition to tupling and anonymous
base types. NRC expressions thus not only encompass flat relations over
primitive datatypes like SQL but also nested collections, while
remaining useful in practice.

We extend the above correspondence between first-order logic and flat
relational queries to NRC and implicit definitions using set-theoretical
Δ₀ formulas over (typed) nested collections. Our proof of the equivalence
goes through a notion of Δ₀-interpretation and a generalization of Beth
definability for multi-sorted structures. This proof is non-constructive
and thus does not yield any useful algorithm for converting implicit
definitions into NRC terms. Using an approach more closely related to
proof-theoretic interpolation, we give a constructive proof of the
result restricted to intuitionistic provability, i.e., when the input
functionality proof π of φ(I,O) is carried out in intuitionistic logic.
Further, if π is cut-free, this can be done efficiently. Whether or not
there exists a polynomial-time procedure working with classical proofs
of functionality is still an open problem.

I will focus on the effective result for the talk, and if time allows,
discuss the difficulties with extending it to classical logic. I will
not assume any background in either database or model theory.

Fri 27th Nov
10:00 am
11:30 am
Seminar: Charles Paperman
Title: Stackless processing of streamed trees

Abstract: In this talk, I will first present the state of the art in the efficient implementation of streaming-text algorithms on modern architectures. I will then present some recent results on extracting information from streamed structured documents without stack overhead.

For more info: paperman.name/data/pub.....d.pdf
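
As a toy illustration of what processing a streamed tree without a stack can look like, the sketch below answers "does some node labelled a have a descendant labelled b?" using only a depth counter and one stored depth. The query, the event encoding and the function name are assumptions chosen for illustration; they are not taken from the talk or the linked paper.

```python
def has_a_with_b_descendant(events, a="a", b="b"):
    """Scan a stream of ("open", label) / ("close", label) events once.

    Memory use is a depth counter plus one remembered depth, with no stack
    of open nodes.
    """
    depth = 0
    a_depth = None  # depth of the outermost currently open node labelled a
    for kind, label in events:
        if kind == "open":
            depth += 1
            if a_depth is not None and label == b:
                return True
            if a_depth is None and label == a:
                a_depth = depth
        else:
            if a_depth == depth:
                a_depth = None  # the remembered a-node is being closed
            depth -= 1
    return False


# <r><a><c><b/></c></a></r>
nested = [("open", "r"), ("open", "a"), ("open", "c"), ("open", "b"),
          ("close", "b"), ("close", "c"), ("close", "a"), ("close", "r")]
# <r><a/><b/></r>
siblings = [("open", "r"), ("open", "a"), ("close", "a"),
            ("open", "b"), ("close", "b"), ("close", "r")]

print(has_a_with_b_descendant(nested))    # True
print(has_a_with_b_descendant(siblings))  # False
```

Only the outermost open a-node needs to be remembered, because any b below a nested a-node is also below the outermost one.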

Permanent link to this article: https://team.inria.fr/links/seminars/