ContentCheck: Models, Algorithms and Tools for Data Journalism and Journalistic Fact-Checking
Fact-checking is the task of assessing the factual accuracy of claims, typically prior to publication. Modern fact-checking is faced with a triple revolution in terms of scale, complexity, and visibility, as claims and background knowledge are increasingly digital.
ContentCheck brings together experts in data management, natural language processing, automated reasoning and data mining from Inria, LIMSI/CNRS and U. Paris Saclay, U. Rennes 1, U. Lyon 1, and the fact-checking team “Les Décodeurs” from Le Monde, the leading French national newspaper. We work to establish fact-checking as a data management problem, endow it with sound foundations, design and deploy novel algorithms for automating fact-checking, and validate them by close interaction with the journalists.
ContentCheck talks, presentations and general-audience publications:
- Ioana Manolescu participated in a televised debate on fake news and the media at “Le Grand Barouf Numérique”, a technology/society meetup organised by the city of Lille, March 2018
- “La science se met au service de l’information”, in “Le Journal Toulousain” (PDF)
- Ioana Manolescu in the Inria Alumni session “Fake news and post-truth: straight to the facts!”, CNAM, October 2017
- “Les chercheurs en informatique au service du fact-checking”, in the DIXIT newsletter of Ecole Polytechnique, June 2017
- “La vérité, rien que la vérité”, article by Ioana Manolescu in the “Blog Binaire” of Le Monde (computer science topics presented to a general audience), April 2017
- Les Décodeurs published an article analyzing the Twitter content produced during the 2016 primaries of the right, in view of the 2017 French election campaign. The article is based on a collection of public tweets, stored and analyzed as part of the ContentCheck project.
- All ContentCheck mentions in the news: Journal du CNRS, Ouest France, Le Devoir (Canada), INRIA news, Le Monde, Rue 89, NLTO, Sciences pour tous, interstices.info, Le Journal Toulousain (PDF)
ContentCheck scientific publications:
- Search for Truth in a Database of Statistics, Tien-Duc Cao, Ioana Manolescu, Xavier Tannier, WebDB workshop, co-located with ACM SIGMOD, 2018.
- A Content Management Perspective on Fact-Checking, Sylvie Cazalens, Philippe Lamarre, Julien Leblay, Ioana Manolescu and Xavier Tannier, “Journalism, Misinformation and Fact Checking” track at The Web Conference (formerly WWW Conference) 2018
- ContentCheck: Content Management Techniques and Tools for Fact-checking, “Digital Humanities” issue of ERCIM News, 2017
- Flash points: Discovering exceptional pairwise behaviors in vote or rating data, Adnene Belfodil, Sylvie Cazalens, Philippe Lamarre, Marc Plantevit. ECML/PKDD conference, 2017.
- Associated material available at: https://github.com/Adnene93/DiscoveringSimilarityChanges/
- Extracting Linked Data from statistic spreadsheets, Tien Duc Cao, Ioana Manolescu, Xavier Tannier, Workshop on Semantic Big Data (SBD), next to the SIGMOD conference, 2017.
- Mixed-instance querying: a lightweight integration architecture for data journalism, Raphaël Bonaque, Tien Duc Cao, Bogdan Cautis, François Goasdoué, Javier Letelier, Ioana Manolescu, Oscar Mendoza, Swen Ribeiro, Xavier Tannier, Michaël Thomazo, VLDB, Sep 2016, New Delhi, India, http://vldb2016.persistent.com/
- Creation, Visualization and Edition of Timelines for Journalistic Use, by Xavier Tannier and Frédéric Vernier, Natural Language Processing Meets Journalism workshop next to IJCAI 2016, http://nlpj2016.fbk.eu
Other ContentCheck presentations:
- Ioana Manolescu gave a presentation on data management for fact-checking in the “AI and Big Data” day at ETIS lab, U. Cergy Pontoise.
- Ioana Manolescu participated in a fact-checking panel at Web2Day 2017, a 3000-strong IT and digital conference in Nantes
- Michaël Thomazo presented FactMinder, our system for archiving, annotating, and querying semantic-rich Web content, at the Tech & Check Conference at Duke University in North Carolina
- Ioana Manolescu and Samuel Laurent gave invited presentations at the Computation and Journalism workshop http://compjournalism2016.irisa.fr/ co-organized by Xavier Tannier (March 2016, Rennes)
ContentCheck meetings:
- Fourth project meeting: Nov 13, 2017 @ Le Monde
- Les Décodeurs’ presentation, Maxime Ferrer
- Searching for truth in a database of statistics, Tien Duc Cao, Ioana Manolescu, Xavier Tannier
- Flashpoints: Mining exceptional pairwise behavior in vote datasets, Adnene Belfodil, Philippe Lamarre, Marc Plantevit, Sylvie Cazalens
- Recherche par mots-clé dans des bases de données hétérogènes, Camille Chanial, Redouane Dziri, Helena Galhardas, Julien Leblay, Minh Huong Le Nguyen, Ioana Manolescu
- Gestion de points de vue sur des données RDF, Ludivine Duroyon, François Goasdoué, Ioana Manolescu
- CEDAR — Le Monde — LIMSI meeting: Aug 28, 2017 @ Skype meeting
- CEDAR — Le Monde — LIMSI meeting: Jun 28, 2017 @ Le Monde
- CEDAR — Le Monde — LIMSI meeting: Apr 11, 2017 @ Le Monde
- Third project meeting: Mar 07, 2017 @ Le Monde
- CEDAR — Le Monde — LIMSI meeting: Nov 11, 2016 @ Skype meeting
- CEDAR — Le Monde — LIMSI meeting: Oct 11, 2016 @ Skype meeting
- CEDAR — Le Monde — LIMSI meeting: Oct 18, 2016 @ Skype meeting
- CEDAR — Le Monde — LIMSI meeting: Sep 27, 2016 @ Le Monde
- Second project meeting: July 5, 2016 @ Le Monde
- Presentation of “Creation, Visualization and Edition of Timelines for Journalistic Use” by Xavier Tannier and Frédéric Vernier
- Presentation of “Towards Automatic Topic Assignment for News Articles” by Tien-Duc Cao, Ioana Manolescu and Xavier Tannier
- (Brief) presentation of “Tatooine, a lightweight integration architecture for data journalism” by Raphaël Bonaque, Tien Cao, Bogdan Cautis, François Goasdoué, Javier Letelier, Ioana Manolescu, Oscar Mendoza, Swen Ribeiro, Xavier Tannier, Michaël Thomazo
- Discussion on fact-checking procedures and datasets
- CEDAR — Le Monde — LIMSI meeting: May 30, 2016 @ INRIA, Palaiseau
- First project meeting: January 29, 2016 @ Le Monde in Paris
ContentCheck partners:
- INRIA – CEDAR team: models, languages, and efficient database techniques for complex, semantic-rich data
- IRISA, Université de Rennes 1 – SHAMAN team: automated reasoning, representing and querying knowledge, and integrating heterogeneous information sources
- LIMSI, CNRS – ILES team: natural language processing, information retrieval and extraction, event extraction and information
Xavier Tannier, Brigitte Grau, Patrick Paroubek
- LIRIS, CNRS – DB & DM2L teams: data management and mining, in particular declarative and logical approaches, query evaluation, heterogeneous data integration in large-scale distributed systems, and constraint-based pattern mining in particular for spatio-temporal data
Philippe Lamarre, Sylvie Cazalens, J.-Marc Petit, Marc Plantevit, Céline Robardet
- Le Monde, Les Décodeurs: leading French newspaper and fact-checking pioneer in the French news industry; provides application knowledge, helps define a typology of needs, builds a reference set for evaluation, and proof-tests the ideas and tools produced by the project
Samuel Laurent, Aline Rouyer, Ludovic Werwinski