Senior Research Scientist, Inria
Scientific leader of the KerData research team at INRIA Rennes – Bretagne Atlantique Research Center and IRISA.
Inria Rennes Bretagne – Atlantique
Campus Universitaire de Beaulieu, 35042, Rennes
Office: D173 (orange level)
Phone: +33 (0)2 99 84 72 44
Fax: +33 (0)2 99 84 71 71
My main current research interests are related to Big Data management for large-scale distributed infrastructures: clouds, exascale HPC systems, and converged infrastructures for HPC and Big Data analytics.
- Decentralized management of massive data on highly distributed infrastructures;
- Cloud data services and Big Data analytics;
- Scalable I/O and in situ visualization and processing for Exascale HPC systems;
- Convergence of Big Data and HPC: storage and processing architectures;
- Scalable transparent data storage and sharing;
- Scalable distributed file systems;
- BLOB-based data management;
- Data consistency protocols.
- Track Chair of the ACM/IEEE SC19 (Supercomputing) conference for Clouds and Distributed Computing
- Member of the Best Poster Award and Student Research Competition Committee for IEEE/ACM SC18
- Community building for HPC and Big Data Convergence
- International: invited contributions to all recent BDEC and BDEC2 invitation-only workshops
- Europe: Inria representative in the BDVA working group on HPC-Big Data convergence
- Contributions to the Joint BDVA-ETP4HPC report on technology convergence just published (November 2018)
- France: Member of the Advisory Board of the HPC-BigData Inria Project and lead of the Frameworks Work Package
- Vice Executive Director of JLESC – Joint Inria-Illinois-ANL-BSC-JSC-RIKEN/AICS Laboratory for Extreme-Scale Computing for Inria since 2017
- Program Chair of the IEEE Cluster 2017 conference
Recent awards obtained by co-advised PhD students
- Best Student Paper Finalist at the Supercomputing Asia 2018 conference for Orçun Yildiz.
- Best Student Paper Finalist at the ACM/IEEE SC16 conference for the paper "Tyr: Blob Storage Meets Built-In Transactions" by Pierre Matri, Alexandru Costan, Gabriel Antoniu, Jesús Montes, Maria Pérez. Main author: Pierre Matri, PhD student co-advised with Maria Pérez (442 submissions, 81 accepted papers, 7 finalists).
- Third Award at the ACM Graduate Student Research Competition for Nathanaël Cheriere. Competition organized with the ACM/IEEE SC16 conference.
- Accessit to the 2015 Gilles Kahn Honorary PhD Thesis Award of the SIF and the Academy of Science (second prize) for Matthieu Dorier (2011–2014), for his work on I/O variability on extreme-scale computing systems. Co-advised with Luc Bougé.
- Second Gilles Kahn Honorary PhD Thesis Award of SPECIF and the Academy of Science for Bogdan Nicolae (2007–2010), for his work on the BlobSeer blob-based storage system. Co-advised with Luc Bougé.
Selected recent keynote talks, invited talks and panels
BDEC and BDEC2
BDEC2 is the second series of BDEC workshops. BDEC2 will focus on the problem that the previous series identified and analyzed, namely, the problem of defining and creating consensus around a shared cyberinfrastructure for science in the data-saturated world that is now emerging. Since massive amounts of data will soon be generated nearly everywhere, massive amounts of computing and storage will have to be available for use at the “edge” or in the “fog,” as well as in commercial Clouds and HPC centers. This is spelled out in more detail in a short prospectus. BDEC workshops are closed and invitation-only. BDEC invites representatives of leading application communities to participate in one or more of the workshops and contribute to the BDEC prospective documents.
- BDEC2: The Sigma Data Processing Architecture: Leveraging Future Data for Extreme-Scale Data Analytics to Enable High-Precision Decisions. Invited talk at the 1st workshop of the BDEC2 series, Bloomington, November 2018. White paper available here.
- BDEC: Big Data and Extreme-Scale Computing: a Storage-Based Pathway to Convergence. Invited keynote talk at the 4th BDEC workshop, Frankfurt, June 2016.
Invited talks at other international events, panels
- Invited talk at Sintef: Convergence of HPC and Big Data: a Vision, Oslo, October 2018.
- 5 Invited talks at successive editions (2012 – 2018) of the workshop of the Joint Laboratory for Extreme-Scale Computing (JLESC).
- Invited talk at the Huawei European Research Center, Munich, in January 2018. Low-latency Storage for Stream Data.
- Invited talk at the Huawei European Research Center, Munich in October 2017. Convergence of HPC and Big Data.
- Keynote talk at the BigStorage and WALL ITN Joint Meeting, Mainz, January 2017. Týr: Storage-based Convergence Between HPC and Big Data.
- Invited Panelist at the Panel discussion on the HPC and Big Data convergence organized with the IEEE Cluster 2016 conference (Taipei, 2016).
- Inria/CIC-IPN workshop: Scalable Big Data Processing on Clouds: A-Brain and Z-CloudFlow, Centro de Investigación en Computación, Instituto Politécnico Nacional, Mexico City, November 2016.
- First Chinese-French Workshop on Extreme Computing: Damaris: Jitter-Free I/O Management and In Situ Visualization of HPC Simulations using Dedicated Cores, Guangzhou, May 2016.
- International lab management: JLESC – Joint Inria-Illinois-ANL-BSC-JSC-RIKEN/AICS Laboratory for Extreme-Scale Computing
- Vice Executive Director for Inria.
- JLESC Topic leader for Inria for data storage, I/O and in situ processing, coordinating JLESC collaboration activities in this area.
- Team management
- Head of the KerData Project-Team (INRIA-ENS Rennes-INSA Rennes).
- International Associate Team management
- Data@Exascale Associate Team with Argonne National Lab and the University of Illinois at Urbana-Champaign (USA, 2013-2018).
- DataCloud@work Associate Team with University Politehnica of Bucharest (UPB), Romania (2010-2012).
- Research project management
- Coordinator of 12 projects: 5 with industry (2 with Microsoft within the Inria-Microsoft Research Joint Research Centre, 1 with Huawei, 1 with Total, 1 with Sun Microsystems), 2 ANR projects, 5 bilateral projects with international partners in USA, Japan, Romania.
- Partner PI (coordinator for Inria) in 5 other projects (2 European, 2 ANR national projects, 1 ANR-JST international project).
- Technology development project management
- Coordinator of 3 projects: ADT BlobSeer, ADT Damaris, ADT Damaris 2 (2018–2021).
Organization of scientific events
- Program Chair, Vice Chair, Track Chair of CORE A international conferences
- IEEE Cluster
- Program Chair in 2017, with Richard Vuduc from Georgia Tech, USA, as a Co-Chair (Honolulu).
- Program Chair in 2014 with Kate Keahey from Argonne National Lab, USA, as a Co-Chair (Madrid).
- Track Chair for the Data, Storage and Visualization Track in 2015 (Chicago).
- ACM/IEEE SC
- Track Chair for the Clouds and Distributed Systems Area for the 2019 edition.
- ACM/IEEE CCGrid
- PC Vice-Chair for the Hybrid and Mobile Clouds Area: 2016 and 2017.
- Track Chair in 2011.
- EIT Digital Future Cloud Symposium: Program Co-Chair of the EIT Digital Future Cloud Symposium, Rennes, October 2015.
- Other chairing roles for international conferences
- 3PGCIC: Track Chair for the Distributed Algorithms Track in 2015.
- IEEE CloudCom: Track Chair for the Map-Reduce Track: 2011 and 2012.
- Publicity Chair for international conferences: IEEE/ACM CCGRID 2013, ACM HPDC 2012, Euro-Par 2007.
- Scientific workshops
- ScienceCloud – International workshop on Scientific Cloud Computing held every year in conjunction with the ACM HPDC conference (CORE A): General Co-Chair in 2012 (Delft), Program Chair in 2013 (New York);
- JLPC – Local Chair for the 7th workshop of the Joint Laboratory for Petascale Computing (JLPC), which later became JLESC: the Joint Laboratory for Extreme Scale Computing, Rennes, 2012.
Participation in journal editorial boards
- Future Generation Computer Systems (CORE A): Special Issue on Mobile, hybrid, and heterogeneous clouds for cyberinfrastructures (Guest Editor, 2018), Special Issue on Resource Management for Big Data Platforms (Guest Editor, 2018)
- Concurrency and Computation: Practice and Experience (CORE A): Special Issue on the Cloud computing for data-driven science and engineering workshop (Guest Editor, 2016).
Best Papers and Best Posters Committees
- Member of the Best Poster Award and Student Research Competition Committee for IEEE/ACM SC18
- Member of the Best Paper Award Committee for IEEE Cluster 2015, 2016 and 2017
Program Committees of recent CORE A international conferences with physical meetings
ACM HPDC 2012-2018, IEEE IPDPS 2018, 2019, ACM/IEEE SC (2013, 2015, 2017, 2018: Papers Committee; 2014, 2018: Posters Committee).
Program Committees of recent international conferences without physical meetings
ICDE 2017 (CORE A+), IEEE Cluster 2008, 2016 (CORE A), IEEE/ACM CCGRID 2013 and 2015 (CORE A), Euro-Par 2010-2012, 2015 (CORE A), IEEE HPCC 2012 (CORE B), IEEE AINA 2011-2012 (CORE B). (Workshops are omitted.)
Gabriel Antoniu is a Senior Research Scientist at Inria, Rennes. He leads the KerData research team, focusing on storage and I/O management for Big Data processing on scalable infrastructures (clouds, HPC systems). He currently serves as Vice Executive Director of JLESC – Joint Inria-Illinois-ANL-BSC-JSC-RIKEN/AICS Laboratory for Extreme-Scale Computing on behalf of Inria. He received his Ph.D. degree in Computer Science in 2001 from ENS Lyon. He leads several international projects in partnership with Microsoft Research, IBM, Argonne National Lab, the University of Illinois at Urbana-Champaign, and Huawei. He served as Program Chair for the IEEE Cluster conference in 2014 and 2017 and regularly serves as a PC member of major conferences in the areas of HPC, cloud computing and Big Data (SC, HPDC, CCGRID, Cluster, Big Data, etc.). He has acted as advisor for 18 PhD theses and has co-authored over 140 international publications in the aforementioned areas.
A list of publications can be found on DBLP.
A comprehensive list of my publications can be found on HAL Open Archives Library.
- BlobSeer: a data management platform we are currently developing for sharing massive data at very large scales. It relies on decentralized data management and versioning techniques to provide scalable data throughput under heavy data access concurrency.
- Damaris: Damaris is a middleware for multicore SMP nodes allowing them to efficiently handle data transfers for storage and visualization by dedicating one or a few cores to the application I/O or for in situ visualization. It has been developed within the framework of a collaboration with the Joint Laboratory for Extreme-Scale Computing (JLESC, ex-JLPC). It was successfully evaluated with the CM1 tornado simulation, one of the Blue Waters target applications, on several supercomputers (Titan, Jaguar, Kraken), where it demonstrated excellent scalability.
- JuxMem: a platform that illustrates the concept of a Grid Data-Sharing Service, defined using a hybrid approach based on Distributed Shared Memory and Peer-to-Peer techniques.
Leadership roles in ongoing projects
- The Data@Exascale Associate Team (2013-2018) with Argonne National Lab and the University of Illinois at Urbana-Champaign. Role: Principal Investigator. This Associate Team was born in the framework of the Joint Inria-Illinois-ANL-BSC Laboratory on Extreme Scale Computing.
- BigStorage (2015-2018) is a European Training Network (ETN) project. Area: Storage-based Convergence of HPC and Cloud infrastructures to handle Big Data. Roles: local coordinator for Inria Rennes Bretagne Atlantique, Work Package leader (Data Science WP), Partner principal investigator (for Inria).
- JLESC (involved since 2009) – Joint Inria-Illinois-ANL-BSC-JSC-RIKEN/AICS Laboratory for Extreme-Scale Computing. Current role: Topic leader for Inria for the data storage, I/O and in situ processing topic, coordinating collaboration activities among the lab partners in these areas.
- Damaris 2 (2019-2021) is an ADT project funded by Inria (Action de Développement Technologique), whose goal is to extend Damaris to address the needs of Big Data analytics. Role: Project Leader.
- Z-CloudFlow (2013-2016): geographically distributed workflows on Azure clouds. Role: Principal Investigator (co-PI: Patrick Valduriez). A project of the Microsoft-Inria Joint Research Centre.
- A-Brain (2010-2013). A project dedicated to joint neuroimaging and genetics analysis on Microsoft’s Azure cloud computing platform. Role: Principal Investigator, with Bertrand Thirion (PARIETAL team, Inria). A project of the Microsoft-Inria Joint Research Centre.
- MapReduce: an ANR Project (2010-2014) with International partners on optimized MapReduce data processing on cloud platforms: Argonne National Lab (USA), University of Illinois at Urbana-Champaign, IBM France, the Joint Inria-ANL-UIUC-BSC Lab for Petascale Computing (ex-JLPC), the AVALON Inria team, IBCP and MEDIT. Role: Principal Investigator.
- Seeding a France-Chicago Collaboration in Exascale Storage for Computational Science (2012): FACCTS joint project with Argonne National Lab (ANL). Role: project co-Principal Investigator, with Rob Ross (ANL).
- F3PC: ANR-JST project (2010-2014). Role: local coordinator for INRIA Rennes – Bretagne Atlantique.
- The SCALUS Marie Curie Initial Training Network, call FP7-PEOPLE-ITN-2008 (2009-2013). Role: coordinator for Inria (teams involved: KerData, Myriads). Other partners: Universidad Politécnica de Madrid, Barcelona Supercomputing Center, University of Paderborn, Ruprecht-Karls-Universität Heidelberg, Durham University, FORTH, Ecole des Mines de Nantes, XLAB, CERN, NEC, Microsoft Research, Fujitsu.
- DataCloud@work (2010-2012): an Inria Associate Team with the University “Politehnica” of Bucharest (UPB), Romania (Valentin Cristea). Role: Principal Investigator.
- Projects with Tsukuba University, Japan (Osamu Tatebe, Gfarm team):
- Bilateral PHC (ex-PAI) Sakura project (INRIA – AIST/University of Tsukuba, 2006-2007) on P2P-based data sharing. Role: Principal Investigator.
- NEGST (2006 – 2009): CNRS-JST project. Role: participant.
- Bilateral project with the University of Illinois at Urbana-Champaign, USA (CNRS-INRIA-UIUC programme, 2006-2007). Role: Principal Investigator.
- GridRand: bilateral PHC Brancusi project with the Technical University of Cluj-Napoca, Romania (2009-2010). Role: Principal Investigator.
- GridDataViz: bilateral project with “Politehnica” University of Bucharest (CNRS – Romanian Academy of Science, 2008-2009). Topic: visualization and remote control of the BlobSeer data management platform using the MonALISA monitoring framework. Role: Principal Investigator.
The Grid Data-Sharing Service approach I worked on between 2004 and 2008 was at the center of the GDS project of the French ACI MD (2003 – 2006) and was enhanced and validated within the LEGO and RESPIRE ANR projects (2006-2009).
GDS (2003 – 2006): ACI MD project. Role: Principal Investigator.
RESPIRE (2006 – 2008): ANR project. Role: local coordinator for INRIA Rennes – Bretagne Atlantique.
LEGO (2006 – 2009): ANR project. Role: local coordinator for INRIA Rennes – Bretagne Atlantique.
GdX (2003 – 2006): ACI-MD project. Role: local coordinator for INRIA Rennes – Bretagne Atlantique.
Technology Development Projects
- Damaris (2016-2018) was an ADT project funded by Inria (Action de Développement Technologique), whose goal was to transform Damaris into production-level software and to develop its user community. Role: Project Leader.
- BlobSeer (2013-2015) was an ADT project funded by Inria (Action de Développement Technologique), whose goal was to develop BlobSeer into production-level software. Role: Project Leader.
Note: the % represents my contribution to advising. I also serve as a PhD director (directeur de thèse) – for all PhD students listed below, except for Tien Dat Phan.
Nathanaël Cheriere: ENS Rennes, 2016–. Co-advised (50%) with Matthieu Dorier (50%). Subject: elastic storage systems.
Ovidiu Cristian Marcu: Inria/H2020 BigStorage project, INSA Rennes, 2015–. Co-advised (40%) with Alexandru Costan (30%), María Pérez (30%). Subject: Efficient data processing and streaming strategies for workflow-based Big Data processing.
Former PhD students
Pierre Matri: Universidad Politécnica de Madrid/H2020 BigStorage project, 2015–2018. Co-advised (30%) with María Pérez (70%). Subject: storage for converged HPC-Big Data systems. Now a Postdoctoral Appointee at Argonne National Lab, USA.
Mohamed Yacine Taleb: Inria/H2020 BigStorage project, ENS Rennes, 2015–2018. Co-advised (50%) with Toni Cortés (50%). Subject: energy-impact of data consistency management. Now a Research Engineer at Criteo.
Tien Dat Phan: MESR Grant, ENS Rennes, 2014–. Co-advised (30%) with Luc Bougé (40%) and Shadi Ibrahim (30%). Subject: green Big Data processing in Clouds. Now a Big Data Software Architect at Dassault Systèmes.
Orçun Yildiz: INRIA CORDI-S grant, ENS Rennes, 2014– . Co-advised (50%) with Shadi Ibrahim (50%). Subject: energy-efficient Big Data management in HPC systems. Now a Postdoctoral Appointee at Argonne National Lab, USA.
Luis Eduardo Pineda Morales: Microsoft Research Inria Joint Centre, INSA Rennes, 2013–. Co-advised (50%) with Alexandru Costan (50%). Subject: data management for distributed cloud workflows. Now a Big Data/cloud Specialist at Activeeon.
Matthieu Dorier (2011–2014). Co-advised (70%) with Luc Bougé (30%). Subject: I/O Variability in Post-Petascale HPC Simulations. Accessit to the 2015 Gilles Kahn Honorary PhD Thesis Award of the SIF and the Academy of Science (second prize). Now a Software Development Specialist (permanent position) at Argonne National Lab, USA.
Radu Tudoran (2011–2014). Co-advised (70%) with Luc Bougé (30%). Subject: Big Data management across cloud datacenters. Now a Senior Research Engineer at Huawei Technologies, Munich, Germany.
Houssem-Eddine Chihoub (2010–2013). Co-advised (70%) with María Pérez (30%). Subject: data consistency in the cloud. Now a Postdoctoral Researcher at Institut Polytechnique de Grenoble, France.
Viet-Trung Tran (2009–2012). Co-advised (70%) with Luc Bougé (30%). Subject: storage for HPC systems. Now a Lecturer at the School of Information and Communication Technology, Hanoi University of Science and Technology, Vietnam.
Alexandra Carpen-Amarie (2008 – 2011). Co-advised (70%) with Luc Bougé (30%). Subject: using the BlobSeer approach for self-adaptive cloud data management. Now a Postdoctoral researcher at TU Wien, Vienna, Austria.
Diana Moise (2008 – 2011). Co-advised (70%) with Luc Bougé (30%). Subject: using the BlobSeer approach for efficient MapReduce processing. Now a Big Data Application Analyst at Cray, Inc., Zürich, Switzerland.
Bogdan Nicolae (2007 – 2010) – Second Gilles Kahn/SPECIF PhD Thesis Award in 2011. Co-advised (70%) with Luc Bougé (30%). Subject: the BlobSeer approach to large-scale data management for data-intensive applications. After 18 months as a postdoc at UIUC, Bogdan was hired as a Research Scientist at IBM Research Dublin in 2012. He recently joined Huawei Technologies as a Principal Research Scientist (Munich, Germany).
Loïc Cudennec (2005 – 2009). Co-advised (70%) with Luc Bougé (30%). Subject: grid application deployment. Now a Research engineer at CEA (LIST lab), Saclay, France.
Sébastien Monnet (2003 – 2006). Co-advised (50%) with Luc Bougé (50%). Sébastien was hired in 2007 as an Associate Professor (MdC) at Université Pierre et Marie Curie, Paris, as a member of the REGAL team (LIP6 – INRIA Rocquencourt). He is now a Professor at Université Savoie Mont Blanc (Polytech’ Annecy-Chambéry/LISTIC), Annecy, France.
Mathieu Jan (2003 – 2006). Co-advised (50%) with Luc Bougé (50%). Subject: grid application deployment. Now a Research engineer at CEA (LIST lab), Saclay, France.
Teaching (since 2004)
ENSAI – responsible for the Big Data management course of the Statistics and Data Science track – lectures and practical sessions (24h/year since 2013).
ENSAI – co-responsible for two courses (Cloud Computing and Hadoop Technologies) for the MSc in Statistics for Smart Data – lectures and practical sessions (15h/year since 2015).
EIT ICT Labs Master School, University of Rennes 1, SDS module (10h/year since 2014).
CEA-EDF-INRIA School (2009) on emerging grid middleware standards: lectures on grid data management (9h).