Gabriel Antoniu

Senior Research Scientist, Inria

Scientific leader of the KerData research team at INRIA Rennes – Bretagne Atlantique Research Center and IRISA.

Contact details

Inria Rennes Bretagne – Atlantique
Campus Universitaire de Beaulieu, 35042, Rennes
Office: F427 (red level)
Phone: +33 (0)2 99 84 72 44
Fax: +33 (0)2 99 84 71 71

Research Interests

My main current research interests are related to Big Data management for large-scale distributed infrastructures: clouds and post-petascale HPC systems.

Relevant topics:

  • Decentralized management of massive data on highly distributed infrastructures;
  • Cloud data services and MapReduce-based data analytics;
  • Scalable I/O and in situ visualization for Exascale HPC systems;
  • Scalable transparent data storage and sharing;
  • Scalable distributed file systems;
  • BLOB-based data management;
  • Data consistency protocols.

Recent Highlights

Recent invited talks (selection)

  • BDEC: Big Data and Extreme-Scale Computing: a Storage-Based Pathway to Convergence. Invited keynote talk at the 4th Big Data and Exascale Computing (BDEC) workshop, Frankfurt, June 2016. BDEC is a working group exploring the convergence of Big Data and Extreme Computing areas.
  • First Chinese-French Workshop on Extreme Computing: Damaris: Jitter-Free I/O Management and In Situ Visualization of HPC Simulations using Dedicated Cores, Guangzhou, May 2016.
  • 5th JLESC workshop: Spark versus Flink: Understanding Performance in Big Data Analytics Frameworks, Lyon, June 2016.
  • Inria/CIC-IPN workshop: Scalable Big Data Processing on Clouds: A-Brain and Z-CloudFlow, Centro de Investigación y Computación, Instituto Politécnico Nacional, Mexico City, November 2016.
  • Inria/Technicolor workshop on big data and analytics: Spark versus Flink: Understanding Performance in Big Data Analytics Frameworks, Rennes, November 2016.
  • 6th JLESC workshop: Storage-Based Convergence Between HPC and Big Data, Kobe, December 2016.

Short bio

Gabriel Antoniu is a Senior Research Scientist at Inria, Rennes. He leads the KerData research team, focusing on storage and I/O management for Big Data processing on scalable infrastructures (clouds, HPC systems). He received his Ph.D. degree in Computer Science in 2001 from ENS Lyon. He leads several international projects in partnership with Microsoft Research, IBM, Argonne National Lab, and the University of Illinois at Urbana-Champaign. He served as Program Chair for the IEEE Cluster 2014 conference and regularly serves as a PC member of major conferences in the areas of HPC, cloud computing and Big Data (SC, HPDC, CCGRID, Cluster, Big Data, etc.). He has acted as advisor for 18 PhD theses and has co-authored over 120 international publications.

Publications

A list of publications can be found on DBLP.

A comprehensive list of my publications can be found on HAL Open Archives Library:

Software

  • BlobSeer is a data management platform we are currently developing for sharing massive data at very large scales. It relies on advanced techniques for decentralized data management and versioning to provide scalable data throughput under heavy data access concurrency.
  • Damaris is a middleware for multicore SMP nodes that lets them efficiently handle data transfers for storage and visualization by dedicating one or a few cores to application I/O or in situ visualization. It has been developed within the framework of a collaboration with the Joint Laboratory for Extreme-Scale Computing (JLESC, ex-JLPC). It was successfully evaluated with the CM1 tornado simulation, one of the Blue Waters target applications, on several supercomputers (Titan, Jaguar, Kraken), where it demonstrated excellent scalability.
  • JuxMem is a platform that illustrates the concept of a Grid Data-Sharing Service, defined using a hybrid approach based on Distributed Shared Memory and Peer-to-Peer techniques.

Leadership roles in the scientific community and in ongoing projects

  • Program Chair of the IEEE Cluster 2017 conference
  • Program Vice-Chair of the ACM/IEEE CCGrid 2016 and ACM/IEEE CCGrid 2017 conferences for the mobile and hybrid clouds area.
  • Z-CloudFlow (2013-2016): geographically distributed workflows on Azure clouds. Role: Principal Investigator (co-PI: Patrick Valduriez). A project of the Microsoft-Inria Joint Research Centre.
  • The Data@Exascale Associate Team (2013-2018) with Argonne National Lab and the University of Illinois at Urbana-Champaign. Role: Principal Investigator. This Associate Team was born in the framework of the Joint Inria-Illinois-ANL-BSC Laboratory on Extreme Scale Computing.
  • BigStorage (2015-2018) is a European Training Network (ETN) project. Area: Storage-based Convergence of HPC and Cloud infrastructures to handle Big Data. Roles: local coordinator for Inria Rennes Bretagne Atlantique, Work Package leader (Data Science WP), Partner scientific contact (for Inria).
  • JLESC (involved since 2009): Joint Inria-Illinois-ANL-BSC-JSC-RIKEN/AICS Laboratory for Extreme-Scale Computing. Current role: Topic leader for Inria for the data storage, I/O and in situ processing topic, coordinating collaboration activities among the lab partners in these areas.
  • DAMARIS (2016-2018) is an ADT project funded by Inria (Action de Développement Technologique), whose goal is to transform Damaris into production-level software and to develop its user community. Role: Project Leader.

Previous leadership roles

Community
International Projects
  • A-Brain (2010-2013). A project dedicated to joint neuroimaging and genetics analysis on Microsoft’s Azure cloud computing platform. Role: Principal Investigator, with Bertrand Thirion (PARIETAL team, Inria). A project of the Microsoft-Inria Joint Research Centre.
  • MapReduce: an ANR project (2010-2014) on optimized MapReduce data processing on cloud platforms, with international partners: Argonne National Lab (USA), University of Illinois at Urbana-Champaign, IBM France, the Joint Inria-ANL-UIUC-BSC Lab for Petascale Computing (ex-JLPC), the AVALON Inria team, IBCP and MEDIT. Role: Principal Investigator.
  • Seeding a France-Chicago Collaboration in Exascale Storage for Computational Science (2012): FACCTS joint project with Argonne National Lab (ANL). Role: project co-Principal Investigator, with Rob Ross (ANL).
  • F3PC: ANR-JST project (2010-2014). Role: local coordinator for INRIA Rennes – Bretagne Atlantique.
  • The SCALUS Marie Curie Initial Training Network, call FP7-PEOPLE-ITN-2008 (2009-2013). Role: coordinator for Inria (teams involved: KerData, Myriads). Other partners: Universidad Politécnica de Madrid, Barcelona Supercomputing Center, University of Paderborn, Ruprecht-Karls-Universität Heidelberg, Durham University, FORTH, Ecole des Mines de Nantes, XLAB, CERN, NEC, Microsoft Research, Fujitsu.
  • DataCloud@work (2010-2012): an Inria Associate Team with the University “Politehnica” of Bucharest (PUB), Romania (Valentin Cristea). Role: Principal Investigator.
  • Projects with Tsukuba University, Japan (Osamu Tatebe, Gfarm team):
    • Bilateral PHC (ex-PAI) Sakura project (INRIA – AIST/University of Tsukuba, 2006-2007) on P2P-based data sharing. Role: Principal Investigator.
    • NEGST (2006 – 2009): CNRS-JST project. Role: participant.
  • Bilateral project with the University of Illinois at Urbana Champaign, USA (CNRS-INRIA-UIUC programme, 2006-2007). Role: Principal Investigator.
  • GridRand: bilateral PHC Brancusi project with the Technical University of Cluj-Napoca, Romania (2009-2010). Role: Principal Investigator.
  • GridDataViz: bilateral project with “Politehnica” University of Bucharest (CNRS – Romanian Academy of Science, 2008-2009). Topic: visualization and remote control of the BlobSeer data management platform using the MonALISA monitoring framework. Role: Principal Investigator.
National Projects

The Grid Data-Sharing Service approach I worked on between 2004 and 2008 was at the center of the GDS project of the French ACI MD (2003 – 2006) and was enhanced and validated within the LEGO and RESPIRE ANR projects (2006-2009).

  • GDS (2003 – 2006): ACI MD project. Role: Principal Investigator.
  • RESPIRE (2006 – 2008): ANR project. Role: local coordinator for INRIA Rennes – Bretagne Atlantique.
  • LEGO (2006 – 2009): ANR project. Role: local coordinator for INRIA Rennes – Bretagne Atlantique.
  • GdX (2003 – 2006): ACI-MD project. Role: local coordinator for INRIA Rennes – Bretagne Atlantique.

Recent conference Program Committees (non-Chair PC member)

ACM HPDC 2012-2017 (CORE A), ICDE 2017 (CORE A+), IEEE Cluster 2008, 2016 (CORE A), ACM/IEEE SC (2013, 2015, 2017: Papers Committee; 2014: Posters Committee – CORE A), IEEE/ACM CCGRID 2013 and 2015 (CORE A), Euro-Par 2008, 2012, and 2015 (CORE A), IEEE HPCC 2012 (CORE B), IEEE AINA 2011 and 2012 (CORE B), IEEE ICPADS 2010 (CORE B).

Advised PhD students

Note: the % represents my contribution to advising. I also serve as a PhD director (directeur de thèse) – for all PhD students listed below, except for Tien Dat Phan.

  • Nathanaël Cheriere: ENS Rennes, 2016–. Co-advised (30%) with Shadi Ibrahim (70%). Subject: resource management and scheduling for Big Data applications in large-scale systems.
  • Ovidiu Cristian Marcu: Inria/H2020 BigStorage project, INSA Rennes, 2015–. Co-advised (25%) with Alexandru Costan (25%), María Pérez (25%) and Jesús Montés (25%). Subject: Efficient data processing and streaming strategies for workflow-based Big Data processing.
  • Pierre Matri: Universidad Politécnica de Madrid/H2020 BigStorage project, 2015–. Co-advised (30%) with María Pérez (70%). Subject: storage for converged HPC-Big Data systems.
  • Mohamed Yacine Taleb: Inria/H2020 BigStorage project, ENS Rennes, 2015–. Co-advised (40%) with Shadi Ibrahim (30%) and Toni Cortés (30%). Subject: energy-impact of data consistency management.
  • Tien Dat Phan: MESR Grant, ENS Rennes, 2014–. Co-advised (30%) with Luc Bougé (40%) and Shadi Ibrahim (30%). Subject: green Big Data processing in Clouds.
  • Orçun Yildiz: INRIA CORDI-S grant, ENS Rennes, 2014–. Co-advised (50%) with Shadi Ibrahim (50%). Subject: energy-efficient Big Data management in HPC systems.
  • Luis Eduardo Pineda Morales: Microsoft Research Inria Joint Centre, INSA Rennes, 2013–. Co-advised (50%) with Alexandru Costan (50%). Subject: data management for distributed cloud workflows.
  • Lokman Rahmani: MESR Grant, ENS Rennes, 2013–. Co-advised (50%) with Luc Bougé (50%). Subject: smart in situ visualization.

Former PhD students

  • Matthieu Dorier (2011–2014). Co-advised (70%) with Luc Bougé (30%). Subject: I/O Variability in Post-Petascale HPC Simulations. Accessit to the 2015 Gilles Kahn Honorary PhD Thesis Award of the SIF and the Academy of Science (second prize). Now a Postdoctoral Appointee at Argonne National Lab, USA.
  • Radu Tudoran (2011–2014). Co-advised (70%) with Luc Bougé (30%). Subject: Big Data management across cloud datacenters. Now a Senior Research Engineer at Huawei Technologies, Munich, Germany.
  • Bunjamin Memishi (2011–2015). Co-advised (30%) with María Pérez (70%). Subject: reliable MapReduce processing.
  • Houssem-Eddine Chihoub (2010–2013). Co-advised (70%) with María Pérez (30%). Subject: data consistency in the cloud. Now a Postdoctoral Researcher at Institut Polytechnique de Grenoble, France.
  • Viet-Trung Tran (2009–2012). Co-advised (70%) with Luc Bougé (30%). Subject: storage for HPC systems. Now a Lecturer at the School of Information and Communication Technology, Hanoi University of Science and Technology, Vietnam.
  • Alexandra Carpen-Amarie (2008 – 2011). Co-advised (70%) with Luc Bougé (30%). Subject: using the BlobSeer approach for self-adaptive cloud data management. Now a Postdoctoral researcher at TU Wien, Vienna, Austria.
  • Diana Moise (2008 – 2011). Co-advised (70%) with Luc Bougé (30%). Subject: using the BlobSeer approach for efficient MapReduce processing. Now a Big Data Application Analyst at Cray, Inc., Zürich, Switzerland.
  • Bogdan Nicolae (2007 – 2010). Second Gilles Kahn/SPECIF PhD Thesis Award in 2011. Co-advised (70%) with Luc Bougé (30%). Subject: the BlobSeer approach to large-scale data management for data-intensive applications. After 18 months as a postdoc at UIUC, Bogdan was hired as a Research Scientist at IBM Research Dublin in 2012. He recently joined Huawei Technologies as a Principal Research Scientist (Munich, Germany).
  • Loïc Cudennec (2005 – 2009). Co-advised (70%) with Luc Bougé (30%). Subject: grid application deployment. Now a Research engineer at CEA (LIST lab), Saclay, France.
  • Sébastien Monnet (2003 – 2006). Co-advised (50%) with Luc Bougé (50%). Sébastien was hired in 2007 as an Associate Professor (MdC) at Université Pierre et Marie Curie, Paris, and a member of the REGAL team (LIP6 – INRIA Rocquencourt). He is now a Professor at Université Savoie Mont Blanc (Polytech’ Annecy-Chambéry/LISTIC), Annecy, France.
  • Mathieu Jan (2003 – 2006). Co-advised (50%) with Luc Bougé (50%). Subject: grid application deployment. Now a Research engineer at CEA (LIST lab), Saclay, France.

Teaching (since 2004)

  • ENSAI – responsible for the Big Data management course of the Statistics and Data Science track – lectures and practical sessions (24h/year since 2013).
  • ENSAI – co-responsible for two courses (Cloud Computing and Hadoop Technologies) for the MSc in Statistics for Smart Data – lectures and practical sessions (15h/year since 2015).
  • EIT ICT Labs Master School, University of Rennes 1, SDS module (10h/year since 2014).
  • University of Nantes, ALMA Master, Distributed Architectures module – AD (since 2009), lectures on grid, P2P and cloud data management (8-10h/year) and a few hours of supervised work (TD) and practical sessions (TP) per year.
  • ENS Cachan – Antenne de Bretagne, Master M2RI, SDS (ex-PAP) module (2004-2015), lectures on P2P systems (5-10h/year).
  • Ecole Supérieure d’Informatique, Electronique, Automatique, 5th year, full grid and cloud computing module (2009-2013), lectures (18h/year).
  • CEA-EDF-INRIA School (2009) on emerging grid middleware standards: lectures on grid data management (9h).
  • INSA de Rennes, CS Department, 5th year, MPP module (2004 – 2009), lectures on P2P systems (2-4h/year) and 4h of practical sessions/year.
  • University of Rennes 1, Operating Systems module (SYR), L3 level (2004 – 2008), lectures on networks (6h/year).