Permanent researchers (PIs)
- Gaël Varoquaux (team leader), Research Director
- Marine Le Morvan, Research Scientist (chargée de recherche)
- Jill-Jênn Vie, Research Scientist (chargé de recherche)
- Judith Abécassis, Research Scientist (ISFP)
Senior research members
- Léa Hoisnard (AP-HP)
Junior research members
Students
- Félix Lefebvre – PhD Student. Working on large-scale graph-embedding methods to represent large relational stores.
- Samuel Girard – PhD Student. Reinforcement learning in education.
- Julie Alberge – PhD Student. Modeling trajectories of diabetic patients from AP-HP.
- Jovan Stojanovic – PhD Student. Targeting Strategies in Political Networks: machine-learning insights on Twitter data.
- Marie Generali – Fairness in educational data mining
- Sébastien Melo – Grouping losses
- Célestin Eve
Post-docs
- Riccardo Cappuzzo – Post-doc, working on assembling features across relational databases
- Jun Kim – Post-doc, working on graph neural networks for relational databases
- Lihu Chen – Natural Language Processing and Large Language Models
- Jingang Qu – Tabular foundation models
- David Holzmüller – Tabular machine learning
Engineers
- Jérôme Dockès – Research software engineer. Working on skrub.
- Hiba Bederina – Research engineer. Working with Pass Culture.
Interns
- Anav Agrawal
Collaborators
- Linus Bleistein
- Julie Josse
- Clémence Réda
- Jean Vassoyan
Research Team Assistant
- Ekaterina “Katia” George
Alumni
- Samuel Brasil de Alburquerque – Working on diabetes epidemiology from observational health informatics.
- Alexis Cvetkov-Iliev – Working on statistical analysis across relational databases with embeddings.
- Bénédicte Colnet – Working on causal inference, with a focus on assessing randomized controlled trials’ external validity.
- Julien Jerphanion – Research software engineer. Core developer of scikit-learn since 2021.
- Lilian Boulard – Software engineering apprentice. Working on skrub.
- Matthieu Doutreligne – Working on transfer learning and causal inference for public health, in partnership with HAS.
- Tomas Rigaux – Research engineer. Working on recommendations for the job market.
- Léo Grinsztajn – PhD Student. Working on neural networks for tabular and relational data.
- Alexandre Perez – PhD Student. Working on supervised learning in the presence of missing values and assessment of classification confidences through calibration and grouping loss.
- Clémence Réda – Marie Skłodowska-Curie post-doc on the project Robust Explainable Controllable Standard for drug Screening (RECeSS). Now at CNRS.
Scikit-learn team
Soda hosted the core scikit-learn development team up to 2024.
- Arturo Amor-Quiroz – Research software engineer and PhD in physics. Focused on the scikit-learn documentation.
- Jérémie du Boisberranger – Research software engineer and physicist. Core developer of scikit-learn since 2019.
- Franck Charras – Research software engineer. Working on a GPU programming project, within a partnership with Intel®.
- Vincent Maladière – Research software engineer, focusing on data wrangling, survival analysis, and MLOps. Working on scikit-learn, skrub, and hazardous. Collaboration on health data with AP-HP.
- Loïc Estève – Research software engineer and physicist. Core developer of scikit-learn since 2016.
- Olivier Grisel – Research software engineer. Core developer of scikit-learn since 2010.
- François Goupil – Research software engineer. Facilitates our community and manages the operations of the consortium and the relationship with our patrons.
- Guillaume Lemaître – Research software engineer. Core developer of scikit-learn since 2017.