CodeCommons (2025-2026)

CodeCommons (2025-2026) is a two-year project building on the foundation of Software Heritage, the world’s largest public source code archive. The project, led by Roberto Di Cosmo (Inria), is funded by the Banque Publique d’Investissement (BPI), with academic and industry partners in France (Inria, AboutCode, ALManaCH, CEA, DiverSE, Tweag) and Italy (Università di Pisa, Università degli Studi di Torino). Its goal is to expand and enhance the archive, consolidating the critical, qualified information needed to create smaller, higher-quality datasets for the next generation of responsible AI tools. In this project, Patrick Valduriez serves as a consultant on the big data infrastructure.