Kun Zhang defends his PhD

The team is once again celebrating a major achievement by one of our own—Kun Zhang has successfully defended his PhD! 

Kun’s thesis, “Contributions to Evaluating and Improving the Faithfulness for Text Generation,” is a remarkable achievement in a field that is both technically demanding and rapidly evolving. His research stands out not just for its depth, but for its relevance to some of the most pressing challenges in AI today. During his defense, he demonstrated exceptional command of his subject and fielded every question with clarity, insight, and calm confidence. 

Beyond his work, anyone who’s had the chance to chat with Kun knows he’s not just a brilliant researcher; he’s also endlessly curious about the world, funny in unpredictable ways, and a joy to talk to. He brings energy and insight to every exchange, and it’s been a privilege to work alongside him. 

Congratulations, Dr. Zhang! We can’t wait to see where your talents take you next.

Thesis Supervisors 

Ioana Manolescu, Directrice de recherche, Inria Saclay (LIX), Directrice de thèse

Oana Balalau, Inria Starting Faculty, Inria Saclay (LIX), Co-directrice de thèse

Defense Jury 

Eric de la Clergerie, Directeur de recherche, Inria Paris, Examinateur
Claire Gardent, Directrice de recherche, CNRS, Université de Lorraine (Loria), Rapporteur
Pierre-Henri Paris, Maître de conférences, Université de Paris Saclay (LISN), Examinateur
Simon Razniewski, Professor, Technical University Dresden and ScaDS.AI, Rapporteur
Laure Soulier, Maîtresse de conférences, HDR, Sorbonne Université and CNRS (ISIR), Examinatrice

Thesis Abstract 

Ensuring factual faithfulness in text generation is a critical challenge, especially in knowledge-grounded tasks where the generated text must accurately reflect the content of structured or unstructured input data. While modern text generation models achieve strong fluency and coherence, they frequently produce hallucinated outputs, for instance ignoring crucial details, misinterpreting relationships, or introducing contradictions. This thesis therefore presents new methods for both evaluating and improving the factual faithfulness of text generation.

The first contribution of the thesis is FactSpotter, a novel metric designed to assess faithfulness in graph-to-text generation. Unlike approaches based on n-grams or language-model embeddings, FactSpotter evaluates whether each key fact in the input knowledge graph is correctly verbalized in the generated text. It leverages a self-supervised classifier to distinguish between faithful and unfaithful verbalizations. Beyond triple-level evaluation for graph-to-text generation, FactSpotter can also be integrated into the beam search process as a soft constraint, encouraging more faithful text generation without compromising fluency.
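To give a flavor of the soft-constraint idea, here is a minimal, self-contained sketch. It is not the thesis's actual implementation: the real FactSpotter uses a trained classifier, whereas `faithfulness_score` below is a toy lexical stand-in, and `rescore_beam`, `weight`, and the example data are all hypothetical names chosen for illustration. The point it shows is how a faithfulness bonus can be blended into beam candidates' log-probability scores so that decoding prefers outputs that verbalize the input facts.

```python
import math

def faithfulness_score(candidate_text, facts):
    """Toy stand-in for a learned faithfulness classifier: the
    fraction of (subject, relation, object) facts whose subject
    and object strings both appear in the candidate text."""
    if not facts:
        return 1.0
    text = candidate_text.lower()
    hits = sum(
        1 for subj, _rel, obj in facts
        if subj.lower() in text and obj.lower() in text
    )
    return hits / len(facts)

def rescore_beam(candidates, facts, weight=2.0):
    """Apply the soft constraint: combine each candidate's model
    log-probability with a weighted log-faithfulness bonus, then
    sort the beam so more faithful candidates rank higher."""
    rescored = [
        (text, logprob + weight * math.log(faithfulness_score(text, facts) + 1e-9))
        for text, logprob in candidates
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

# Hypothetical example: the second candidate is more probable under
# the language model but fails to verbalize the input fact.
facts = [("Alan Turing", "birthPlace", "London")]
candidates = [
    ("Alan Turing was born in London.", -4.0),
    ("Alan Turing was a mathematician.", -3.5),
]
best, _ = rescore_beam(candidates, facts)[0]
```

With the faithfulness bonus, the beam promotes the candidate that mentions both "Alan Turing" and "London", even though the model assigned it a lower raw log-probability.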

Beyond graph-to-text generation, this thesis explores cross-text factual consistency verification, assessing whether multiple texts convey the same factual content. A structured discourse representation format is introduced to address the limitations of traditional triple-based representations. This format captures not only richer atomic details, such as direct and indirect objects, adverbials, and complements, but also discourse-level relations, including temporality, comparison, and contingency. A new dataset, DiscInfer, is annotated to train entailment-based models to detect factual inconsistencies at both the atomic and discourse levels.
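The atomic-level verification step can be sketched as follows. This is an illustrative simplification, not the thesis's method: an entailment model trained on data such as DiscInfer would replace `toy_entails`, which here is just a content-word overlap heuristic, and `inconsistent_units` is a hypothetical helper name. The sketch shows the overall shape: split a candidate text into atomic units, then flag any unit that the reference text does not entail.

```python
def toy_entails(premise, hypothesis):
    """Toy stand-in for a trained entailment model: treat the
    hypothesis as entailed if all of its content words appear
    in the premise."""
    stop = {"the", "a", "an", "is", "was", "in", "of", "and", "to"}
    prem_words = set(premise.lower().replace(".", "").split())
    hyp_words = [
        w for w in hypothesis.lower().replace(".", "").split()
        if w not in stop
    ]
    return all(w in prem_words for w in hyp_words)

def inconsistent_units(reference, atomic_units):
    """Return the atomic units of a candidate text that the
    reference does not entail, i.e. potential factual
    inconsistencies."""
    return [u for u in atomic_units if not toy_entails(reference, u)]

# Hypothetical example: the second unit states a year that the
# reference text does not support.
reference = "The prize was awarded to Marie Curie in 1911."
units = [
    "Marie Curie was awarded the prize.",
    "The prize was awarded in 1903.",
]
flagged = inconsistent_units(reference, units)
```

In the full pipeline described in the abstract, the atomic units would come from the structured discourse representation rather than hand-written sentences, and discourse-level relations (temporality, comparison, contingency) would be checked in the same entailment-based fashion.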

Empirical results show that the methods proposed in the thesis enhance both factual consistency evaluation and faithful natural language generation. FactSpotter-guided decoding effectively mitigates hallucinations in generation from structured data, while the structured discourse representation strengthens factual consistency verification between texts. Together, these contributions help build more trustworthy text generation systems, offering insights into improving fact-grounded generation across various applications.