4 and 5 February 2016

The seminar will take place at LORIA (access instructions), room A008.

Programme

Thursday 4 February

Time	Speaker	Title
10:00-10:10	Welcome
10:10-11:00	Jiří Maršík	DRT as a Programming Language
11:00-11:50	Richard Moot	Comparing and Evaluating Extended Lambek Calculi
11:50-12:40	Chloé Braud	Automatic Identification of Implicit Discourse Relations from Annotated Corpora and Raw Data
12:40-14:40	Lunch
14:40-15:30	Noortje Venhuizen	Projection in Discourse: A data-driven formal semantic analysis
15:30-16:20	Justyna Grudzińska-Zawadowska, Marek Zawadowski	Generalized Quantifiers on Dependent Types: A System for Anaphora
16:20-16:40	Break
16:40-17:30	Nicholas Asher	Lexical Work for Discourse Analysis and Compositional Semantics

Friday 5 February

Time	Speaker	Title
10:00-10:50	Corinne Rossari	The Influence of mais on the Paradigm of Epistemic Adverbs: a Quantitative and Qualitative Study
10:50-11:40	Laurence Danlos	Building Discourse Connective Lexicons
11:40-12:30	Timothée Bernard	Convergence of the Syntactic, Semantic and Pragmatic Properties of Discourse Connectives
12:30-14:30	Lunch
14:30-15:20	Christian Retoré	Quantification: its Syntax and Semantics (survey)
15:20-16:10	Sylvain Pogodalla	Interfacing Discourse and Sentential TAG-Based Grammars
16:10	Closing

Participants

Confirmed participants as of today:

Maxime Amblard (LORIA, Université de Lorraine, Nancy)
Nicholas Asher (IRIT, CNRS, Toulouse)
Timothée Bernard (Alpage, Université Paris 7)
Clément Beysson (LORIA, Université de Lorraine)
Chloé Braud (Department of Nordic Research, University of Copenhagen, Denmark)
Laurence Danlos (Alpage, Université Paris 7)
Philippe de Groote (LORIA, INRIA Nancy)
Justyna Grudzińska-Zawadowska (Institute of Philosophy, University of Warsaw, Poland)
Jiří Maršík (LORIA, INRIA Nancy)
Aleksandre Maskharashvili (LORIA, INRIA Nancy)
Richard Moot (LaBRI, CNRS, Bordeaux)
Jonata Poitz (Max Planck Institut für Informatik)
Sylvain Pogodalla (LORIA, INRIA Nancy)
Christian Retoré (LIRMM, Université de Montpellier)
Corinne Rossari (Université de Neuchâtel, Switzerland)
Noortje Venhuizen (Computational Linguistics and Phonetics department, Saarland University)
Marek Zawadowski (Institute of Mathematics, University of Warsaw, Poland)

Abstracts

Automatic Identification of Implicit Discourse Relations from Annotated Corpora and Raw Data

Chloé Braud (Department of Nordic Research, University of Copenhagen, Denmark)

The development of systems for the automatic discourse analysis of documents is currently a major challenge in Natural Language Processing. The main difficulty lies in identifying the relations (such as Explanation, Contrast, ...) that link the segments making up a document. In particular, identifying so-called implicit relations, that is, relations not marked by a discourse connective (such as "but", "because", ...), is known to be hard: it requires taking a variety of cues into account and raises specific difficulties for an automatic classification system. In this thesis, we use raw data to improve systems for the automatic identification of implicit relations.

We first propose using connectives to automatically annotate new data. We apply strategies drawn from domain adaptation that allow us to handle the distributional differences between automatically and manually annotated data: we report improvements for systems built on the French ANNODIS corpus and on the English Penn Discourse Treebank corpus. We then propose using word representations learned from raw data, possibly automatically annotated with connectives, to enrich the data representation based on the words present in the segments to be linked. We report improvements on the English Penn Discourse Treebank corpus and show, in particular, that this method reduces the need for rich resources, which are available for only a few languages.

Projection in Discourse: A data-driven formal semantic analysis

Noortje Venhuizen (Computational Linguistics and Phonetics department, Saarland University)

In this talk, I present a unified, data-driven formal semantic analysis of projection phenomena, which include presuppositions, anaphoric expressions, and conventional implicatures (as defined by Potts, 2005). The different contributions made by these phenomena are explained in terms of the notion of information status. Based on this analysis, I present a new semantic formalism called Projective Discourse Representation Theory (PDRT). PDRT is an extension of traditional Discourse Representation Theory (Kamp, 1981; Kamp and Reyle, 1993), which directly implements the anaphoric theory of presuppositions (van der Sandt, 1992) by means of the introduction of projection variables. I show that PDRT captures the differences, as well as the similarities, between the contributions made by presuppositions, anaphora and conventional implicatures. In order to illustrate PDRT’s representational power, I present a data-driven computational analysis of the information status of referential expressions based on data from the Groningen Meaning Bank, a corpus of semantically annotated texts (Basile et al., 2012). Taken together, the results pave the way for a more integrated formal and empirical analysis of different aspects of linguistic meaning.

Generalized Quantifiers on Dependent Types: A System for Anaphora

Justyna Grudzińska-Zawadowska (Institute of Philosophy, University of Warsaw, Poland) and Marek Zawadowski (Institute of Mathematics, University of Warsaw, Poland)

In our paper, we propose a system for the interpretation of anaphoric relationships between unbound pronouns and quantifiers. The main technical contribution of our proposal consists in combining generalized quantifiers (Mostowski, 1957; Lindström, 1966; Barwise and Cooper, 1981) with dependent types (Martin-Löf, 1972, 1984; Makkai, 1995). This combination allows us to provide a uniform mechanism to account for the main kinds of unbound anaphora (maximal anaphora to quantifiers, quantificational subordination, cumulative and branching continuations, ‘donkey anaphora’), with the anaphoric effects falling out naturally as a consequence of having generalized quantification on dependent types. I will first define the syntax and semantics of the system. Next, I will describe our process of English-to-formal language translation. Finally, I will present some applications of the system (this is joint work with Justyna Grudzińska).

Building Discourse Connective Lexicons

Laurence Danlos (Alpage, Université Paris 7)

Discourse connectives are (simple or compound) lexical items that express discourse relations between two discourse segments. For French, we developed a lexicon of connectives which records, for each entry, its syntactic category and its sense(s) (i.e. which discourse relation(s) it signals), along with possible other information (e.g. constraints on its position) and examples. A first version, developed from linguistic knowledge, was revised after a discourse annotation experiment, and we will present the two methods. Connective lexicons exist for other languages, German for example. We will make a cross-linguistic comparison of the two resources.

Quantification: its Syntax and Semantics (survey)

Christian Retoré (LIRMM, Université de Montpellier)

In this non-technical talk, I shall survey some of the various means to analyse and compute the syntactic analyses of sentences involving quantifiers, the corresponding logical formulae, and the way they can be interpreted.

Interfacing Discourse and Sentential TAG-Based Grammars

Sylvain Pogodalla (LORIA, INRIA Nancy), joint work with Laurence Danlos (Alpage, Université Paris 7) and Aleksandre Maskharashvili

We present a method to interface a sentential grammar and a discourse grammar without resorting to an intermediate processing step. The method is general enough to build discourse structures that are directed acyclic graphs (DAGs) and not only trees. Our analysis is based on Discourse Synchronous TAG (D-STAG), a Tree-Adjoining Grammar (TAG)-based approach to discourse. We also use an encoding of TAG into Abstract Categorial Grammar (ACG). This encoding allows us, on the one hand, to express a higher-order semantic interpretation that enables building DAG discourse structures and, on the other hand, to smoothly integrate the sentential and discourse grammars thanks to the modular capability of ACG.

5th meeting
