Gabriel Antoniu

Senior Research Scientist, Inria

Scientific leader of the KerData research team at INRIA Rennes – Bretagne Atlantique Research Center and IRISA.

Contact details

Inria Rennes Bretagne – Atlantique
Campus Universitaire de Beaulieu, 35042, Rennes
Office: D173 (orange level)
Phone: +33 (0)2 99 84 72 44
Fax: +33 (0)2 99 84 71 71

Research Interests

My main current research interests are related to Big Data management for large-scale distributed infrastructures: clouds, exascale HPC systems, and converged infrastructures for HPC and Big Data analytics.

Relevant topics:

  • Decentralized management of massive data on highly distributed infrastructures: HPC, cloud, edge
  • HPC/Big Data/AI convergence: storage and processing architectures
  • Cloud data services and Big Data analytics
  • Scalable I/O and in situ visualization and processing for Exascale HPC systems
  • Scalable transparent data storage and sharing
  • Scalable distributed file systems
  • BLOB-based data management
  • Data consistency protocols

Recent Highlights

2020

  • Exascale Computing
    • Newly accepted H2020 project: EuroHPC ACROSS (2021-2024). To start in March 2021.
      • Topic: HPC/Big Data/Artificial Intelligence Cross-Stack Platform Towards Exascale
      • Budget: 600k€ funding for Inria/KerData project team.
    • Newly accepted H2020 project: EuroHPC EUPEX – European Pilot for Exascale (2021-2025). Acceptance notified on January 25, 2021.
      • The EUPEX consortium aims to design, build, and validate the first EU platform for HPC, covering end-to-end the spectrum of required technologies with European assets: from the architecture, processor, system software, development tools to the applications.

  • HPC/AI/Big Data convergence
    • February 2020: Our first joint paper on AI received the Outstanding Paper Award (Special Track for Social Impact) at the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20, an A*-level conference):
      • Kevin Fauvel, Daniel Balouek-Thomert, Diego Melgar, Pedro Silva, Anthony Simonet, Gabriel Antoniu, Alexandru Costan, Véronique Masson, Manish Parashar, Ivan Rodero and Alexandre Termier. A Distributed Multi-Sensor Machine Learning Approach to Earthquake Early Warning.
  • Community service in Europe and in France: ETP4HPC, BDVA, TCI, Projet Exascale
    • TransContinuum Initiative, 2020: Co-leader of the use-case analysis Working Group. BDVA-ETP4HPC Liaison Officer.
    • ETP4HPC Strategic Research Agenda (SRA, 2020): Co-leader of the working group on Programming Models and on the research clusters “Data Everywhere” and “HPC and the Digital continuum”, co-editor of the SRA document published in March 2020.
    • Projet Exascale (aiming to prepare the setup of a future European Exascale machine planned to be hosted in France): member of the Applications Working Group (2021).
  • HPS – International Workshop on High-Performance Storage, held in conjunction with the IEEE IPDPS conference (CORE A)
    • 2021: Workshop Chair
    • 2020: Program Chair

2019 and before

Recent awards obtained by co-advised PhD students

Selected recent keynote talks, invited talks and panels

BDEC and BDEC2

BDEC2 is the second series of BDEC workshops. BDEC2 will focus on the problem that the previous series identified and analyzed, namely, the problem of defining and creating consensus around a shared cyberinfrastructure for science in the data-saturated world that is now emerging. Since massive amounts of data will soon be generated nearly everywhere, massive amounts of computing and storage will have to be available for use at the “edge” or in the “fog,” as well as in commercial clouds and HPC centers. This is spelled out in more detail in a short prospectus. BDEC workshops are closed and invitation-only. BDEC invites representatives of leading application communities to participate in one or more of the workshops and contribute to the BDEC prospective documents.

Invited talks at other international events, panels

  • Invited talk at the BenchCouncil 2019 conference
  • Invited talk at CIC-IPN, Mexico, October 2019: From Big Data to Fast Data.
  • Invited talk at Sintef: Convergence of HPC and Big Data: a Vision, Oslo, October 2018.
  • 6 Invited talks at successive editions (2012 – 2019) of the workshop of the Joint Laboratory for Extreme-Scale Computing (JLESC).
  • Invited talk at the Huawei European Research Center, Munich, in January 2018. Low-latency Storage for Stream Data.
  • Invited talk at the Huawei European Research Center, Munich in October 2017. Convergence of HPC and Big Data.
  • Keynote talk at the BigStorage and WALL ITN Joint Meeting, Mainz, January 2017. Týr: Storage-based Convergence Between HPC and Big Data.
  • Invited Panelist at the Panel discussion on the HPC and Big Data convergence organized with the IEEE Cluster 2016 conference (Taipei, 2016).
  • Inria/CIC-IPN workshop: Scalable Big Data Processing on Clouds: A-Brain and Z-CloudFlow, Centro de Investigación y Computación, Instituto Politécnico Nacional, Mexico City, November 2016.
  • First Chinese-French Workshop on Extreme Computing: Damaris: Jitter-Free I/O Management and In Situ Visualization of HPC Simulations using Dedicated Cores, Guangzhou, May 2016.

Community service

Management roles

  • International lab management: JLESC – Joint Inria-Illinois-ANL-BSC-JSC-RIKEN/AICS Laboratory for Extreme-Scale Computing
    • Vice Executive Director for Inria.
    • JLESC Topic leader for Inria for data storage, I/O and in situ processing, coordinating JLESC collaboration activities in this area.
  • Team management
  • International Associate Team management
    • UNIFY Associate Team with Argonne National Lab (USA, 2019-2021),
    • Data@Exascale Associate Team with Argonne National Lab and the University of Illinois at Urbana-Champaign (USA, 2013-2018),
    • DataCloud@work Associate Team with University Politehnica of Bucharest (UPB), Romania (2010-2012).
  • Research project management
    • Coordinator of 12 projects: 5 with industry (2 with Microsoft within the Inria-Microsoft Research Joint Research Centre, 1 with Huawei, 1 with Total, 1 with Sun Microsystems), 2 ANR projects, 5 bilateral projects with international partners in USA, Japan, Romania.
    • Partner PI (coordinator for Inria) in 5 other projects (2 European, 2 ANR national projects, 1 ANR-JST international project).
  • Technology development project management
    • Coordinator of 3 projects: ADT BlobSeer, ADT Damaris, ADT Damaris 2 (2018–2021).

Organization of scientific events

  • Program Chair, Vice Chair, Track Chair of CORE A international conferences
    • IEEE Cluster 
      • Program Chair in 2017, with Richard Vuduc from Georgia Tech, USA, as a Co-Chair (Honolulu).
      • Program Chair in 2014 with Kate Keahey from Argonne National Lab, USA, as a Co-Chair (Madrid).
      • Track Chair for the Data, Storage and Visualization Track in 2015 (Chicago).
    • ACM/IEEE SC
      • Track Chair for the Clouds and Distributed Systems Area for the 2019 edition.
    • ACM/IEEE CCGrid 
      • PC Vice-Chair for the Hybrid and Mobile Clouds Area: 2016 and 2017.
    • Euro-Par 
      • Track Chair in 2011.
  • EIT Digital Future Cloud Symposium: Program Co-Chair of the EIT Digital Future Cloud Symposium, Rennes, October 2015.
  • Other chairing roles for international conferences
    • 3PGCIC: Track Chair for the Distributed Algorithms Track in 2015.
    • IEEE CloudCom: Track Chair for the Map-Reduce Track: 2011 and 2012.
    • Publicity Chair for international conferences: IEEE/ACM CCGRID 2013, ACM HPDC 2012, Euro-Par 2007.
  • Scientific workshops
    • HPS 2019 – First International Workshop on High-Performance Storage, held in conjunction with the IEEE IPDPS 2019 conference (CORE A): Program Chair
    • ScienceCloud – International workshop on Scientific Cloud Computing held every year in conjunction with the ACM HPDC conference (CORE A): General Co-Chair in 2012 (Delft), Program Chair in 2013 (New York).
    • JLPC: Local Chair for the 7th workshop of the Joint Laboratory for Petascale Computing (JLPC), which later became JLESC, the Joint Laboratory for Extreme-Scale Computing, Rennes, 2012.

 

Participation in journal editorial boards

  • Journal of Parallel and Distributed Computing – JPDC – Elsevier (CORE A): Associate Editor since 2019.
  • Future Generation Computer Systems (CORE A): Special Issue on Mobile, hybrid, and heterogeneous clouds for cyberinfrastructures (Guest Editor, 2018), Special Issue on Resource Management for Big Data Platforms (Guest Editor, 2018).
  • Concurrency and Computation: Practice and Experience (CORE A): Special issue on the Cloud computing for data-driven science and engineering workshop (Guest Editor, 2016).

Best Papers and Best Posters Committees

Program Committees of recent CORE A international conferences with physical meetings

ACM HPDC 2012-2018, IEEE IPDPS 2018, 2019, ACM/IEEE SC (2013, 2015, 2017, 2018: Papers Committee; 2014, 2018: Posters Committee).

Program Committees of recent international conferences without physical meetings

ICDE 2017 (CORE A+), IEEE Cluster 2008, 2016 (CORE A), IEEE/ACM CCGRID 2013 and 2015 (CORE A), Euro-Par 2010-2012, 2015 (CORE A), IEEE HPCC 2012 (CORE B), IEEE AINA 2011-2012 (CORE B). (Workshops are omitted.)

Short bio

Gabriel Antoniu is a Senior Research Scientist at Inria, Rennes, where he leads the KerData research team. His recent research interests include scalable storage, I/O and in situ visualization, and data processing architectures favoring the convergence of HPC, Big Data analytics and AI. He has served as a PI for several international projects in these areas in partnership with Microsoft Research, IBM, ATOS/BULL, Argonne National Lab, the University of Illinois at Urbana-Champaign, Universidad Politécnica de Madrid, and Barcelona Supercomputing Center. He served as Program Chair for the IEEE Cluster conference in 2014 and 2017 and regularly serves as a PC member of major conferences in the areas of HPC, cloud computing and Big Data analytics (SC, HPDC, CCGRID, Cluster, Big Data, etc.). He has acted as advisor for 20 PhD theses and has co-authored over 150 international publications in the aforementioned areas.

Publications

A list of publications can be found on DBLP.

A comprehensive list of my publications can be found on the HAL Open Archives Library.

Software

  • BlobSeer is a data management platform we are currently developing for sharing massive data at very large scales. It relies on advanced techniques for decentralized data management and versioning to provide scalable data throughput under heavy data access concurrency.
  • Damaris is a middleware for multicore SMP nodes that allows them to efficiently handle data transfers for storage and visualization by dedicating one or a few cores to application I/O or in situ visualization. It has been developed within the framework of a collaboration with the Joint Laboratory for Extreme-Scale Computing (JLESC, ex-JLPC). It was successfully evaluated with the CM1 tornado simulation, one of the Blue Waters target applications, on several supercomputers (Titan, Jaguar, Kraken), where it demonstrated excellent scalability.
  • JuxMem is a platform that illustrates the concept of a Grid Data-Sharing Service, defined using a hybrid approach based on Distributed Shared Memory and Peer-to-Peer techniques.

Leadership roles in ongoing projects

  • JLESC (involved since 2009): Joint Inria-Illinois-ANL-BSC-JSC-RIKEN/AICS Laboratory for Extreme-Scale Computing. Current role: Topic leader for Inria for the data storage, I/O and in situ processing topic, coordinating collaboration activities among the lab partners in these areas.
  • The UNIFY Associate Team (2019-2021) with Argonne National Lab. Role: Principal Investigator. Topic: Intelligent Unified Data Services for Hybrid Workflows Combining Compute-Intensive Simulations and Data-Intensive Analytics at Extreme Scales. This Associate Team was born in the framework of the Joint Inria-Illinois-ANL-BSC Laboratory on Extreme Scale Computing as a follow-up team to the Data@Exascale Associate Team.
  • Damaris 2 (2019-2021) is an ADT project funded by Inria (Action de Développement Technologique), whose goal is to extend Damaris to address the needs of Big Data analytics. Role: Project Leader.

Previous projects

International Projects

  • The Data@Exascale Associate Team (2013-2018) with Argonne National Lab and the University of Illinois at Urbana-Champaign. Role: Principal Investigator. This Associate Team was born in the framework of the Joint Inria-Illinois-ANL-BSC Laboratory on Extreme Scale Computing.
  • BigStorage (2015-2018) is a European Training Network (ETN) project. Area: Storage-based Convergence of HPC and Cloud infrastructures to handle Big Data. Roles: local coordinator for Inria Rennes Bretagne Atlantique, Work Package leader (Data Science WP), partner principal investigator (for Inria).
  • Z-CloudFlow (2013-2016): geographically distributed workflows on Azure clouds. Role: Principal Investigator (co-PI: Patrick Valduriez). A project of the Microsoft-Inria Joint Research Centre.
  • A-Brain (2010-2013). A project dedicated to joint neuroimaging and genetics analysis on Microsoft’s Azure cloud computing platform. Role: Principal Investigator, with Bertrand Thirion (PARIETAL team, Inria). A project of the Microsoft-Inria Joint Research Centre.
  • MapReduce: an ANR project (2010-2014) with international partners on optimized MapReduce data processing on cloud platforms: Argonne National Lab (USA), University of Illinois at Urbana-Champaign, IBM France, the Joint Inria-ANL-UIUC-BSC Lab for Petascale Computing (ex-JLPC), the AVALON Inria team, IBCP and MEDIT. Role: Principal Investigator.
  • Seeding a France-Chicago Collaboration in Exascale Storage for Computational Science (2012): FACCTS joint project with Argonne National Lab (ANL). Role: project co-Principal Investigator, with Rob Ross (ANL).
  • F3PC: ANR-JST project (2010-2014). Role: local coordinator for INRIA Rennes – Bretagne Atlantique.
  • The SCALUS Marie Curie Initial Training Network, call FP7-PEOPLE-ITN-2008 (2009-2013). Role: coordinator for Inria (teams involved: KerData, Myriads). Other partners: Universidad Politécnica de Madrid, Barcelona Supercomputing Center, University of Paderborn, Ruprecht-Karls-Universität Heidelberg, Durham University, FORTH, Ecole des Mines de Nantes, XLAB, CERN, NEC, Microsoft Research, Fujitsu.
  • DataCloud@work (2010-2012): an Inria Associate Team with the University “Politehnica” of Bucharest (PUB), Romania (Valentin Cristea). Role: Principal Investigator.
  • Projects with Tsukuba University, Japan (Osamu Tatebe, Gfarm team):
    • Bilateral PHC (ex-PAI) Sakura project (INRIA – AIST/University of Tsukuba, 2006-2007) on P2P-based data sharing. Role: Principal Investigator.
    • NEGST (2006 – 2009): CNRS-JST project. Role: participant.
  • Bilateral project with the University of Illinois at Urbana-Champaign, USA (CNRS-INRIA-UIUC programme, 2006-2007). Role: Principal Investigator.
  • GridRand: bilateral PHC Brancusi project with the Technical University of Cluj-Napoca, Romania (2009-2010). Role: Principal Investigator.
  • GridDataViz: bilateral project with “Politehnica” University of Bucharest (CNRS – Romanian Academy of Science, 2008-2009). Topic: visualization and remote control of the BlobSeer data management platform using the MonALISA monitoring framework. Role: Principal Investigator.

National Projects

The Grid Data-Sharing Service approach I worked on between 2004 and 2008 was at the center of the GDS project of the French ACI MD (2003 – 2006) and was enhanced and validated within the LEGO and RESPIRE ANR projects (2006-2009).

  • GDS (2003 – 2006): ACI MD project. Role: Principal Investigator.
  • RESPIRE (2006 – 2008): ANR project. Role: local coordinator for INRIA Rennes – Bretagne Atlantique.
  • LEGO (2006 – 2009): ANR project. Role: local coordinator for INRIA Rennes – Bretagne Atlantique.
  • GdX (2003 – 2006): ACI-MD project. Role: local coordinator for INRIA Rennes – Bretagne Atlantique.

Technology Development Projects

  • Damaris (2016-2018) was an ADT project funded by Inria (Action de Développement Technologique), whose goal was to transform Damaris into production-level software and to develop its user community. Role: Project Leader.
  • BlobSeer (2013-2015) was an ADT project funded by Inria (Action de Développement Technologique), whose goal was to develop BlobSeer into production-level software. Role: Project Leader.

 


Advised PhD students

Note: the % represents my contribution to advising. I also serve as a PhD director (directeur de thèse) – for all PhD students listed below, except for Tien Dat Phan.

Thomas Bouvier (2021-2024) co-advised with Alexandru Costan (INSA Rennes). Topic: reproducible deployment and scheduling strategies for AI workloads on the Digital Continuum.

Daniel Rosendo (2019-2022) co-advised with Gabriel Antoniu (Inria) and Patrick Valduriez (Inria). Topic: enabling HPC-Big Data Convergence for Intelligent Extreme-Scale Analytics.

Former PhD students

Nathanaël Cheriere: ENS Rennes, 2016–2019. Co-advised (50%) with Matthieu Dorier (50%). Subject: elastic storage systems.

Ovidiu Cristian Marcu: Inria/H2020 BigStorage project, INSA Rennes, 2015–2018. Co-advised (40%) with Alexandru Costan (30%), María Pérez (30%). Subject: Efficient data processing and streaming strategies for workflow-based Big Data processing.

Pierre Matri: Universidad Politécnica de Madrid/H2020 BigStorage project, 2015–2018. Co-advised (30%) with María Pérez (70%). Subject: storage for converged HPC-Big Data systems. Now a Postdoctoral Appointee at Argonne National Lab, USA.

Mohamed Yacine Taleb: Inria/H2020 BigStorage project, ENS Rennes, 2015–2018. Co-advised (50%) with Toni Cortés (50%). Subject: energy-impact of data consistency management. Now a Research Engineer at Criteo.

Tien Dat Phan: MESR Grant, ENS Rennes, 2014–. Co-advised (30%) with Luc Bougé (40%) and Shadi Ibrahim (30%). Subject: green Big Data processing in Clouds. Now a Big Data Software Architect at Dassault Systèmes.

Orçun Yildiz: INRIA CORDI-S grant, ENS Rennes, 2014– . Co-advised (50%) with Shadi Ibrahim (50%). Subject: energy-efficient Big Data management in HPC systems. Now a Postdoctoral Appointee at Argonne National Lab, USA.

Luis Eduardo Pineda Morales: Microsoft Research Inria Joint Centre, INSA Rennes, 2013–. Co-advised (50%) with Alexandru Costan (50%). Subject: data management for distributed cloud workflows. Now a Big Data/cloud Specialist at Activeeon.

Matthieu Dorier (2011–2014). Co-advised (70%) with Luc Bougé (30%). Subject: I/O Variability in Post-Petascale HPC Simulations. Accessit to the 2015 Gilles Kahn Honorary PhD Thesis Award of the SIF and the Academy of Science (second prize). Now a Software Development Specialist (permanent position) at Argonne National Lab, USA.

Radu Tudoran (2011–2014). Co-advised (70%) with Luc Bougé (30%). Subject: Big Data management across cloud datacenters. Now a Senior Research Engineer at Huawei Technologies, Munich, Germany.

Bunjamin Memishi (2011–2015). Co-advised (30%) with María Pérez (70%). Subject: reliable MapReduce processing. Now a postdoctoral researcher at DLR (the German Aerospace Center).

Houssem-Eddine Chihoub (2010–2013). Co-advised (70%) with María Pérez (30%). Subject: data consistency in the cloud. Now a Postdoctoral Researcher at Institut Polytechnique de Grenoble, France.

Viet-Trung Tran (2009–2012). Co-advised (70%) with Luc Bougé (30%). Subject: storage for HPC systems. Now a Lecturer at the School of Information and Communication Technology, Hanoi University of Science and Technology, Vietnam.

Alexandra Carpen-Amarie (2008 – 2011). Co-advised (70%) with Luc Bougé (30%). Subject: using the BlobSeer approach for self-adaptive cloud data management. Now a Postdoctoral researcher at TU Wien, Vienna, Austria.

Diana Moise (2008 – 2011). Co-advised (70%) with Luc Bougé (30%). Subject: using the BlobSeer approach for efficient MapReduce processing. Now a Big Data Application Analyst at Cray, Inc., Zürich, Switzerland.

Bogdan Nicolae (2007 – 2010). Second Gilles Kahn/SPECIF PhD Thesis Award in 2011. Co-advised (70%) with Luc Bougé (30%). Subject: the BlobSeer approach to large-scale data management for data-intensive applications. After 18 months as a postdoc at UIUC, Bogdan was hired as a Research Scientist at IBM Research Dublin in 2012. He recently joined Huawei Technologies as a Principal Research Scientist (Munich, Germany).

Loïc Cudennec (2005 – 2009). Co-advised (70%) with Luc Bougé (30%). Subject: grid application deployment. Now a Research engineer at CEA (LIST lab), Saclay, France.

Sébastien Monnet (2003 – 2006). Co-advised (50%) with Luc Bougé (50%). Sébastien was hired in 2007 as an Associate Professor (MdC) at Université Pierre et Marie Curie, Paris, as a member of the REGAL team (LIP6 – INRIA Rocquencourt). He is now a Professor at Université Savoie Mont Blanc (Polytech’ Annecy-Chambéry/LISTIC), Annecy, France.

Mathieu Jan (2003 – 2006). Co-advised (50%) with Luc Bougé (50%). Subject: grid application deployment. Now a Research engineer at CEA (LIST lab), Saclay, France.


Teaching (since 2009)

I am currently in charge of master-level modules at the University of Rennes 1, ENSAI, EIT Digital Master School (Rennes):

  • ENSAI – Big Data management course of the Statistics and Data Science track – lectures and practical sessions (24h/year since 2013).
  • Master Informatics Science of Rennes – BSI module, lectures on Big Data infrastructures (10h/year, since 2017).
  • MIAGE Master of the University of Rennes 1 – Cloud for Big Data module (lectures on Big Data and cloud computing, 14h/year, since 2017).
  • Cloud and Networking Infrastructures Master Program, University of Rennes 1, IBD module, lectures on cloud computing and Big Data (10h/year since 2014).

Past Teaching activities

  • University of Nantes, ALMA Master, Distributed Architectures module – AD (2009-2016), lectures on grid, P2P and cloud data management (8-10h/year) and a few hours of supervised work (TD) and practical sessions (TP) per year.
  • Ecole Supérieure d’Informatique, Electronique, Automatique, 5th year, full grid and cloud computing module (2009-2013), lectures (18h/year).
  • ENSAI – co-responsible for two courses (Cloud Computing and Hadoop Technologies) of the MSc in Statistics for Smart Data – lectures and practical sessions (15h/year, 2015-2017).