Gabriel Antoniu

Senior Research Scientist, Inria

Scientific leader of the KerData research team at INRIA Rennes – Bretagne Atlantique Research Center and IRISA.

Contact details

Inria Rennes Bretagne – Atlantique
Campus Universitaire de Beaulieu, 35042, Rennes
Office: D173 (orange level)
Phone: +33 (0)2 99 84 72 44
Fax: +33 (0)2 99 84 71 71

Research Interests

My main current research interests are related to Big Data management for large-scale distributed infrastructures: clouds, exascale HPC systems, and converged infrastructures for HPC and Big Data analytics.

Relevant topics:

  • Decentralized management of massive data on highly distributed infrastructures;
  • Cloud data services and Big Data analytics;
  • Scalable I/O and in situ visualization and processing for Exascale HPC systems;
  • Convergence of Big Data and HPC: storage and processing architectures;
  • Scalable transparent data storage and sharing;
  • Scalable distributed file systems;
  • BLOB-based data management;
  • Data consistency protocols.

Recent Highlights

Recent awards obtained by co-advised PhD students

Selected recent keynote talks, invited talks and panels

BDEC and BDEC2

BDEC2 is the second series of BDEC workshops. BDEC2 will focus on the problem that the previous series identified and analyzed, namely, the problem of defining and creating consensus around a shared cyberinfrastructure for science in the data-saturated world that is now emerging. Since massive amounts of data will soon be generated nearly everywhere, massive amounts of computing and storage will have to be available for use at the “edge” or in the “fog,” as well as in commercial clouds and HPC centers. This is spelled out in more detail in a short prospectus. BDEC workshops are closed and invitation-only. BDEC invites representatives of leading application communities to participate in one or more of the workshops and contribute to the BDEC prospective documents.

  • BDEC2:  The Sigma Data Processing Architecture: Leveraging Future Data for Extreme-Scale Data Analytics to Enable High-Precision Decisions. Invited talk at the 1st workshop of the BDEC2 series, Bloomington, November 2018. White paper available here.
  • BDEC: Big Data and Extreme-Scale Computing: a Storage-Based Pathway to Convergence. Invited keynote talk at the 4th BDEC workshop, Frankfurt, June 2016.

Invited talks at other international events, panels

  • Invited talk at Sintef: Convergence of HPC and Big Data: a Vision, Oslo, October 2018.
  • 5 invited talks at successive editions (2012 – 2018) of the workshop of the Joint Laboratory for Extreme-Scale Computing (JLESC).
  • Invited talk at the Huawei European Research Center, Munich, in January 2018. Low-latency Storage for Stream Data.
  • Invited talk at the Huawei European Research Center, Munich in October 2017. Convergence of HPC and Big Data.
  • Keynote talk at the BigStorage and WALL ITN Joint Meeting, Mainz, January 2017. Týr: Storage-based Convergence Between HPC and Big Data.
  • Invited Panelist at the Panel discussion on the HPC and Big Data convergence organized with the IEEE Cluster 2016 conference (Taipei, 2016).
  • Inria/CIC-IPN workshop: Scalable Big Data Processing on Clouds: A-Brain and Z-CloudFlow, Centro de Investigación y Computación, Instituto Politécnico Nacional, Mexico City, November 2016.
  • First Chinese-French Workshop on Extreme Computing: Damaris: Jitter-Free I/O Management and In Situ Visualization of HPC Simulations using Dedicated Cores, Guangzhou, May 2016.

Community service

Management roles

  • International lab management: JLESC – Joint Inria-Illinois-ANL-BSC-JSC-RIKEN/AICS Laboratory for Extreme-Scale Computing
    • Vice Executive Director for Inria.
    • JLESC Topic leader for Inria for data storage, I/O and in situ processing, coordinating JLESC collaboration activities in this area.
  • Team management
  • International Associate Team management
    • Data@Exascale Associate Team with Argonne National Lab and the University of Illinois at Urbana Champaign (USA, 2013-2018),
    • DataCloud@work Associate Team with University Politehnica of Bucharest (UPB), Romania (2010-2012).
  • Research project management
    • Coordinator of 12 projects: 5 with industry (2 with Microsoft within the Inria-Microsoft Research Joint Research Centre, 1 with Huawei, 1 with Total, 1 with Sun Microsystems), 2 ANR projects, 5 bilateral projects with international partners in USA, Japan, Romania.
    • Partner PI (coordinator for Inria) in 5 other projects (2 European, 2 ANR national projects, 1 ANR-JST international project).
  • Technology development project management
    • Coordinator of 3 projects: ADT BlobSeer, ADT Damaris, ADT Damaris 2 (2018–2021).

Organization of scientific events

  • Program Chair, Vice Chair, Track Chair of CORE A international conferences
    • IEEE Cluster 
      • Program Chair in 2017, with Richard Vuduc from Georgia Tech, USA, as a Co-Chair (Honolulu).
      • Program Chair in 2014, with Kate Keahey from Argonne National Lab, USA, as a Co-Chair (Madrid).
      • Track Chair for the Data, Storage and Visualization Track in 2015 (Chicago).
    • ACM/IEEE SC
      • Track Chair for the Clouds and Distributed Systems Area for the 2019 edition.
    • ACM/IEEE CCGrid 
      • PC Vice-Chair for the Hybrid and Mobile Clouds Area: 2016 and 2017.
    • Euro-Par 
      • Track Chair in 2011.
  • EIT Digital Future Cloud Symposium: Program Co-Chair of the EIT Digital Future Cloud Symposium, Rennes, October 2015.
  • Other chairing roles for international conferences
    • 3PGCIC: Track Chair for the Distributed Algorithms Track in 2015.
    • IEEE CloudCom: Track Chair for the Map-Reduce Track: 2011 and 2012.
    • Publicity Chair for international conferences: IEEE/ACM CCGRID 2013, ACM HPDC 2012, Euro-Par 2007.
  • Scientific workshops
    • ScienceCloud – International workshop on Scientific Cloud Computing held every year in conjunction with the ACM HPDC conference (CORE A): General Co-Chair in 2012 (Delft), Program Chair in 2013 (New York);
    • JLPC: Local Chair for the 7th workshop of the Joint Laboratory for Petascale Computing (JLPC), which later became JLESC, the Joint Laboratory for Extreme Scale Computing, Rennes, 2012.

 

Participation in journal editorial boards

  • Future Generation Computer Systems (CORE A): Special Issue on Mobile, hybrid, and heterogeneous clouds for cyberinfrastructures (Guest Editor, 2018), Special Issue on Resource Management for Big Data Platforms (Guest Editor, 2018)
  • Concurrency and Computation: Practice and Experience (CORE A): Special Issue on the Cloud Computing for Data-Driven Science and Engineering workshop (Guest Editor, 2016).

Best Papers and Best Posters Committees

Program Committees of recent CORE A international conferences with physical meetings

ACM HPDC 2012-2018, IEEE IPDPS 2018, 2019, ACM/IEEE SC (2013, 2015, 2017, 2018: Papers Committee; 2014, 2018: Posters Committee).

Program Committees of recent international conferences without physical meetings

ICDE 2017 (CORE A+), IEEE Cluster 2008, 2016 (CORE A), IEEE/ACM CCGRID 2013 and 2015 (CORE A), Euro-Par 2010-2012, 2015 (CORE A), IEEE HPCC 2012 (CORE B), IEEE AINA 2011-2012 (CORE B). (Workshops are omitted.)

Short bio

Gabriel Antoniu is a Senior Research Scientist at Inria, Rennes. He leads the KerData research team, focusing on storage and I/O management for Big Data processing on scalable infrastructures (clouds, HPC systems). He currently serves as Vice Executive Director of JLESC – Joint Inria-Illinois-ANL-BSC-JSC-RIKEN/AICS Laboratory for Extreme-Scale Computing on behalf of Inria. He received his Ph.D. degree in Computer Science in 2001 from ENS Lyon. He leads several international projects in partnership with Microsoft Research, IBM, Argonne National Lab, the University of Illinois at Urbana-Champaign, and Huawei. He served as Program Chair for the IEEE Cluster conference in 2014 and 2017 and regularly serves as a PC member of major conferences in the areas of HPC, cloud computing and Big Data (SC, HPDC, CCGRID, Cluster, Big Data, etc.). He has acted as advisor for 18 PhD theses and has co-authored over 140 international publications in the aforementioned areas.

Publications

A list of publications can be found on DBLP.

A comprehensive list of my publications can be found on HAL Open Archives Library.

Software

  • BlobSeer is a data management platform we are currently developing for sharing massive data at very large scales. It relies on advanced decentralized data management and versioning techniques to provide scalable data throughput under heavy data access concurrency.
  • Damaris is a middleware for multicore SMP nodes that allows them to handle data transfers for storage and visualization efficiently by dedicating one or a few cores to application I/O or in situ visualization. It was developed within a collaboration with the Joint Laboratory for Extreme-Scale Computing (JLESC, ex-JLPC). It was successfully evaluated with the CM1 tornado simulation, one of the Blue Waters target applications, on several supercomputers (Titan, Jaguar, Kraken), where it demonstrated excellent scalability.
  • JuxMem is a platform that illustrates the concept of a Grid Data-Sharing Service, defined using a hybrid approach based on Distributed Shared Memory and Peer-to-Peer techniques.

Leadership roles in ongoing projects

  • The Data@Exascale Associate Team (2013-2018) with Argonne National Lab and the University of Illinois at Urbana-Champaign. Role: Principal Investigator. This Associate Team was born in the framework of the Joint Inria-Illinois-ANL-BSC Laboratory on Extreme Scale Computing.
  • BigStorage (2015-2018) is a European Training Network (ETN) project. Area: Storage-based Convergence of HPC and Cloud infrastructures to handle Big Data. Roles: local coordinator for Inria Rennes Bretagne Atlantique, Work Package leader (Data Science WP), partner principal investigator (for Inria).
  • JLESC (involved since 2009): Joint Inria-Illinois-ANL-BSC-JSC-RIKEN/AICS Laboratory for Extreme-Scale Computing. Current role: Topic leader for Inria for the data storage, I/O and in situ processing topic, coordinating collaboration activities among the lab partners in these areas.
  • Damaris 2 (2019-2021) is an ADT project funded by Inria (Action de Développement Technologique), whose goal is to extend Damaris to address the needs of Big Data analytics. Role: Project Leader.

Previous projects

International Projects

  • Z-CloudFlow (2013-2016): geographically distributed workflows on Azure clouds. Role: Principal Investigator (co-PI: Patrick Valduriez). A project of the Microsoft-Inria Joint Research Centre.
  • A-Brain (2010-2013). A project dedicated to joint neuroimaging and genetics analysis on Microsoft’s Azure cloud computing platform. Role: Principal Investigator, with Bertrand Thirion (PARIETAL team, Inria). A project of the Microsoft-Inria Joint Research Centre.
  • MapReduce: an ANR project (2010-2014) with international partners on optimized MapReduce data processing on cloud platforms: Argonne National Lab (USA), University of Illinois at Urbana-Champaign, IBM France, the Joint Inria-ANL-UIUC-BSC Lab for Petascale Computing (ex-JLPC), the AVALON Inria team, IBCP and MEDIT. Role: Principal Investigator.
  • Seeding a France-Chicago Collaboration in Exascale Storage for Computational Science (2012): FACCTS joint project with Argonne National Lab (ANL). Role: project co-Principal Investigator, with Rob Ross (ANL).
  • F3PC: ANR-JST project (2010-2014). Role: local coordinator for INRIA Rennes – Bretagne Atlantique.
  • The SCALUS Marie Curie Initial Training Network, call FP7-PEOPLE-ITN-2008 (2009-2013). Role: coordinator for Inria (teams involved: KerData, Myriads). Other partners: Universidad Politécnica de Madrid, Barcelona Supercomputing Center, University of Paderborn, Ruprecht-Karls-Universität Heidelberg, Durham University, FORTH, Ecole des Mines de Nantes, XLAB, CERN, NEC, Microsoft Research, Fujitsu.
  • DataCloud@work (2010-2012): an Inria Associate Team with the University “Politehnica” of Bucharest (PUB), Romania (Valentin Cristea). Role: Principal Investigator.
  • Projects with Tsukuba University, Japan (Osamu Tatebe, Gfarm team):
    • Bilateral PHC (ex-PAI) Sakura project (INRIA – AIST/University of Tsukuba, 2006-2007) on P2P-based data sharing. Role: Principal Investigator.
    • NEGST (2006 – 2009): CNRS-JST project. Role: participant.
  • Bilateral project with the University of Illinois at Urbana Champaign, USA (CNRS-INRIA-UIUC programme, 2006-2007). Role: Principal Investigator.
  • GridRand: bilateral PHC Brancusi project with the Technical University of Cluj-Napoca, Romania (2009-2010). Role: Principal Investigator.
  • GridDataViz: bilateral project with “Politehnica” University of Bucharest (CNRS – Romanian Academy of Science, 2008-2009). Topic: visualization and remote control of the BlobSeer data management platform using the MonALISA monitoring framework. Role: Principal Investigator.

National Projects

The Grid Data-Sharing Service approach I worked on between 2004 and 2008 was at the center of the GDS project of the French ACI MD (2003 – 2006) and was enhanced and validated within the LEGO and RESPIRE ANR projects (2006-2009).

  • GDS (2003 – 2006): ACI MD project. Role: Principal Investigator.
  • RESPIRE (2006 – 2008): ANR project. Role: local coordinator for INRIA Rennes – Bretagne Atlantique.
  • LEGO (2006 – 2009): ANR project. Role: local coordinator for INRIA Rennes – Bretagne Atlantique.
  • GdX (2003 – 2006): ACI-MD project. Role: local coordinator for INRIA Rennes – Bretagne Atlantique.

Technology Development Projects

  • Damaris (2016-2018) was an ADT project funded by Inria (Action de Développement Technologique), whose goal was to transform Damaris into production-level software and to develop its user community. Role: Project Leader.
  • BlobSeer (2013-2015) was an ADT project funded by Inria (Action de Développement Technologique), whose goal was to develop BlobSeer into production-level software. Role: Project Leader.

 


Advised PhD students

Note: the percentage represents my contribution to advising. I also serve as PhD director (directeur de thèse) for all PhD students listed below, except Tien Dat Phan.

Nathanaël Cheriere: ENS Rennes, 2016–. Co-advised (50%) with Matthieu Dorier (50%). Subject: elastic storage systems.

Ovidiu Cristian Marcu: Inria/H2020 BigStorage project, INSA Rennes, 2015–. Co-advised (40%) with Alexandru Costan (30%), María Pérez (30%). Subject: Efficient data processing and streaming strategies for workflow-based Big Data processing.

Former PhD students

Pierre Matri: Universidad Politécnica de Madrid/H2020 BigStorage project, 2015–2018. Co-advised (30%) with María Pérez (70%). Subject: storage for converged HPC-Big Data systems.

Mohamed Yacine Taleb: Inria/H2020 BigStorage project, ENS Rennes, 2015–2018. Co-advised (50%) with Toni Cortés (50%). Subject: energy-impact of data consistency management.

Tien Dat Phan: MESR Grant, ENS Rennes, 2014–. Co-advised (30%) with Luc Bougé (40%) and Shadi Ibrahim (30%). Subject: green Big Data processing in Clouds.

Orçun Yildiz: INRIA CORDI-S grant, ENS Rennes, 2014–. Co-advised (50%) with Shadi Ibrahim (50%). Subject: energy-efficient Big Data management in HPC systems.

Luis Eduardo Pineda Morales: Microsoft Research Inria Joint Centre, INSA Rennes, 2013–. Co-advised (50%) with Alexandru Costan (50%). Subject: data management for distributed cloud workflows.

Matthieu Dorier (2011–2014). Co-advised (70%) with Luc Bougé (30%). Subject: I/O Variability in Post-Petascale HPC Simulations. Accessit to the 2015 Gilles Kahn Honorary PhD Thesis Award of the SIF and the Academy of Science (second prize). Now a Postdoctoral Appointee at Argonne National Lab, USA.

Radu Tudoran (2011–2014). Co-advised (70%) with Luc Bougé (30%). Subject: Big Data management across cloud datacenters. Now a Senior Research Engineer at Huawei Technologies, Munich, Germany.

Bunjamin Memishi (2011–2015). Co-advised (30%) with María Pérez (70%). Subject: reliable MapReduce processing.

Houssem-Eddine Chihoub (2010–2013). Co-advised (70%) with María Pérez (30%). Subject: data consistency in the cloud. Now a Postdoctoral Researcher at Institut Polytechnique de Grenoble, France.

Viet-Trung Tran (2009–2012). Co-advised (70%) with Luc Bougé (30%). Subject: storage for HPC systems. Now a Lecturer at the School of Information and Communication Technology, Hanoi University of Science and Technology, Vietnam.

Alexandra Carpen-Amarie (2008 – 2011). Co-advised (70%) with Luc Bougé (30%). Subject: using the BlobSeer approach for self-adaptive cloud data management. Now a Postdoctoral researcher at TU Wien, Vienna, Austria.

Diana Moise (2008 – 2011). Co-advised (70%) with Luc Bougé (30%). Subject: using the BlobSeer approach for efficient MapReduce processing. Now a Big Data Application Analyst at Cray, Inc., Zürich, Switzerland.

Bogdan Nicolae (2007 – 2010) Second Gilles Kahn/SPECIF PhD Thesis Award in 2011. Co-advised (70%) with Luc Bougé (30%). Subject: the BlobSeer approach to large-scale data management for data-intensive applications. After 18 months as a postdoc at UIUC, Bogdan was hired as a Research Scientist at IBM Research Dublin in 2012. He recently joined Huawei Technologies as a Principal Research Scientist (Munich, Germany).

Loïc Cudennec (2005 – 2009). Co-advised (70%) with Luc Bougé (30%). Subject: grid application deployment. Now a Research engineer at CEA (LIST lab), Saclay, France.

Sébastien Monnet (2003 – 2006). Co-advised (50%) with Luc Bougé (50%). Sébastien was hired in 2007 as an Associate Professor (MdC) at Université Pierre et Marie Curie, Paris, a member of the REGAL team (LIP6 – INRIA Rocquencourt). He is now a Professor at Université Savoie Mont Blanc (Polytech’ Annecy-Chambéry/LISTIC), Annecy, France.

Mathieu Jan (2003 – 2006). Co-advised (50%) with Luc Bougé (50%). Subject: grid application deployment. Now a Research engineer at CEA (LIST lab), Saclay, France.


Teaching (since 2004)

ENSAI – responsible for the Big Data management course of the Statistics and Data Science track – lectures and practical sessions (24h/year since 2013).

ENSAI – co-responsible for two courses (Cloud Computing and Hadoop Technologies) for the MSc in Statistics for Smart Data – lectures and practical sessions (15h/year since 2015).

EIT ICT Labs Master School, University of Rennes 1, SDS module (10h/year since 2014).

University of Nantes, ALMA Master, Distributed Architectures module – AD (since 2009), lectures on grid, P2P and cloud data management (8-10h/year) and a few hours of supervised work (TD) and practical sessions (TP) per year.

ENS Cachan – Antenne de Bretagne, Master M2RI, SDS (ex-PAP) module (2004-2015), lectures on P2P systems (5-10h/year).

Ecole Supérieure d’Informatique, Electronique, Automatique, 5th year, full grid and cloud computing module (2009-2013), lectures (18h/year).

CEA-EDF-INRIA School (2009) on emerging grid middleware standards: lectures on grid data management (9h).

INSA de Rennes, CS Department, 5th year, MPP module (2004 – 2009), lectures on P2P systems (2-4h/year) and 4h of practical sessions/year.

University of Rennes 1, Operating Systems module (SYR), L3 level (2004 – 2008), lectures on networks (6h/year).