Shadi Ibrahim


Tenured Inria Research Scientist (CR1)

KerData Team at IRISA and Inria Rennes – Bretagne Atlantique Research Center.

Contact details

Inria Rennes – Bretagne Atlantique
Campus Universitaire de Beaulieu, 35042, Rennes

Office : D186
Phone :  +33 (0) 2 99 84 25 34
Fax :  +33 (0) 2 99 84 71 71

Research Interests

My research interests include big data management, cloud computing, high-performance computing, data-intensive computing, virtualization, file and storage systems, and P2P computing.

Recent Highlights

  • Principal Investigator of the ANR JCJC project (KerStream), budget 238,000 EUR.
  • Member of Grid’5000 Sites Committee: Responsible for the Rennes site.
  • [Program Co-Chair] 2016: I am serving as program co-chair of ICA3PP 2017.
  • [Workshop Co-Chair] 2016: I am serving as workshop co-chair of the 1st Workshop on the Integration of Extreme Scale Computing and Big Data Management and Analytics (EBDMA 2017).
  • [Program Committee] 2016: I will be serving as a program committee member for Data Analytics, Visualization & Storage at SC 2017.
  • [Program Committee] 2016: I will be serving as a program committee member for the Doctoral Showcase at SC 2017.
  • [Tutorial] 2016: Tutorial on Green Big Data Processing using Hadoop at the Euro-Par 2016 conference, Grenoble, France (with Anne-Cécile Orgerie).
  • [Paper] 2016: Our paper on Addressing Performance Variability in Data Management for Post-Petascale Simulations has been accepted in ACM Transactions on Parallel Computing.
  • [Paper] 2016: Our book chapter, A Taxonomy and Survey of Scientific Computing in the Cloud, has been accepted in Big Data: Principles and Paradigms, Wiley press, 2016.
  • [Paper] 2016: Our paper, On the energy footprint of I/O management in Exascale HPC systems, is available online in FGCS 2016.
  • [Paper] 2016: Our paper on Enabling fast failure recovery in shared Hadoop clusters is available online in FGCS 2016.
  • [Paper] 2016: Our paper on the Root Causes of Cross-application I/O Interference in HPC Storage Systems is published in IPDPS 2016.
  • [Paper] 2016: Our paper, On the Usability of Shortest Remaining Time First Policy in Shared Hadoop Cluster, is published in SAC 2016.
  • [Paper] 2015: Our paper on a grammar-based approach to spatial and temporal I/O patterns prediction is available online in TPDS 2015.
  • [Tutorial] 2015: Tutorial on Hadoop, at the IT4Innovations, Ostrava, Czech Republic.

Short bio

I am a research scientist within the KerData team at INRIA Rennes, working on MapReduce and Cloud Storage systems. From 2007 to 2011, I was a PhD student working with Prof. Hai Jin in the Services Computing Technology and System Laboratory (SCTS) at Huazhong University of Science & Technology (HUST).

My research is focused around Distributed Systems and Cloud Computing. My current research focuses on evaluating and improving reliability and consistency in Cloud systems. Previously my work centered around optimizing and improving MapReduce-based Cloud systems (Maestro, Disk Meta-Scheduler, CLOUDLET and LEEN). I also investigated the intra-machine cost fairness of Xen-Based Cloud Systems during my internship at Microsoft Research Center Asia (System Research Group).

I am also broadly interested in other research areas such as: I/O virtualization to provide QoS guarantees for applications while improving disk efficiency, and private torrent system measurement to understand the correlation between different incentive rules and their impact on the user behavior.

Research Experience

Publications

A list of my publications is available on Google Scholar.

2016

  • [FGCS 2016] Orcun Yildiz, Shadi Ibrahim, Gabriel Antoniu, Enabling Fast Failure Recovery in Shared Hadoop Clusters: Towards Failure-Aware Scheduling, In FGCS Journal.
  • [FGCS 2016] Matthieu Dorier, Orcun Yildiz, Shadi Ibrahim, Anne-Cecile Orgerie, Gabriel Antoniu, On the Energy Footprint of I/O Management in Exascale HPC Systems, In FGCS Journal.
  • [IPDPS 2016] Orcun Yildiz, Matthieu Dorier, Shadi Ibrahim, Robert Ross, Gabriel Antoniu, On the Root Causes of Cross-application I/O Interference in HPC Storage Systems, In IPDPS 2016.
  • [SAC 2016] Nathanael Cheriere, Pierre Donat-Bouillud, Shadi Ibrahim, Matthieu Simonin, On the Usability of Shortest Remaining Time First Policy in Shared Hadoop Cluster, In the 31st ACM Symposium On Applied Computing (ACM SAC 2016).
  • [ToPC 2016] Matthieu Dorier, Gabriel Antoniu, Franck Cappello, Marc Snir, Robert Sisneros, Orcun Yildiz, Shadi Ibrahim, Tom Peterka and Leigh Orf, Damaris: Addressing Performance Variability in Data Management for Post-Petascale Simulations, In ACM Transactions on Parallel Computing (ACM ToPC Journal).
  • [Book Chapter] Amelie Chi Zhou, Bingsheng He, Shadi Ibrahim, A Taxonomy and Survey of Scientific Computing in the Cloud, Full chapter in Big Data: Principles and Paradigms, Wiley press, 2016.
  • [Book Chapter] Bunjamin Memishi, Shadi Ibrahim, Maria S. Perez, Gabriel Antoniu, Fault Tolerance in MapReduce: A Survey, Full chapter in Resource Management for Big Data Platforms, Springer press, 2016.
  • [Book Chapter] Bunjamin Memishi, Shadi Ibrahim, Maria S. Perez and Gabriel Antoniu, On the Dynamic Shifting of the MapReduce Timeout, Full chapter in Handbook of Research on Managing and Processing Big Data in Cloud Computing, IGI-Global press, 2016.

2015

  • [FGCS 2015] Shadi Ibrahim, Tien-Dat Phan, Alexandra Carpen-Amarie, Houssem-Eddine Chihoub, Diana Moise, Gabriel Antoniu, Governing Energy Consumption in Hadoop through CPU Frequency Scaling: an Analysis. In FGCS Journal.
  • [TPDS 2015] Matthieu Dorier, Shadi Ibrahim, Gabriel Antoniu, Robert Ross, Using Formal Grammars to Predict I/O Behaviors in HPC: the Omnisc’IO Approach. In IEEE TPDS.
  • [CCPE 2015] Song Wu, Songqiao Tao, Xiao Ling, Hao Fan, Hai Jin, Shadi Ibrahim, iShare: Balancing I/O Performance Isolation and Disk I/O Efficiency in Virtualized Environments. In CCPE Journal.
  • [GreenCom 2015] Tien-Dat Phan, Shadi Ibrahim, Gabriel Antoniu, Luc Bouge, On Understanding the Energy Impact of Speculative Execution in Hadoop, To appear in the 2015 IEEE International Conference on Green Computing and Communications (GreenCom 2015), Sydney, Australia, December 2015.
  • [GreenCom 2015] Frederic Desprez, Shadi Ibrahim, Adrien Lebre, Anne-Cecile Orgerie, Jonathan Pastor, Anthony Simone, Energy-Aware Massively Distributed Cloud Facilities: the DISCOVERY Initiative, Poster to appear in the 2015 IEEE International Conference on Green Computing and Communications (GreenCom 2015), Sydney, Australia, December 2015.
  • [BigData 2015] Orcun Yildiz, Shadi Ibrahim, Gabriel Antoniu, Chronos: Failure-Aware Scheduling in Shared Hadoop Clusters, In the 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, USA, 2015. 
  • [SBAC-PAD 2015] Houssem-Eddine Chihoub, Shadi Ibrahim, Yue Li, Gabriel Antoniu, Maria Perez, Luc Bouge, Exploring Energy-Consistency Trade-offs in Cassandra Cloud Storage System, In the International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2015). 
  • [ARMS-CC 2015] Shadi Ibrahim, Tran Anh Phuong, Gabriel Antoniu, An Eye on the Elephant in the Wild: A Performance Evaluation of Hadoop’s Schedulers Under Failures, In Workshop on Adaptive Resource Management and Scheduling for Cloud Computing (ARMS-CC-2015), held in conjunction with PODC-2015, July 2015.
  • [HPDC 2015] Amelie Chi Zhou, Bingsheng He, Shadi Ibrahim, Reynold C.K. Cheng, Performance and Monetary Cost Optimizations for Scientific Workflows in the Cloud: A Probabilistic Approach, Poster in the ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC 2015), Portland, OR, USA, 2015.

2014

  • [TPDS 2014] Xiao Ling, Shadi Ibrahim, Hai Jin, Song Wu, Spatial Locality Aware Disk Scheduling in Virtualized Environment. To appear in TPDS 2014: Transactions on Parallel and Distributed Systems.
  • [SC 2014] Matthieu Dorier, Shadi Ibrahim, Gabriel Antoniu, Robert Ross, Omnisc’IO: A Grammar-Based Approach to Spatial and Temporal I/O Patterns Prediction. To appear in SC’14: International Conference for High Performance Computing, Networking, Storage and Analysis.
  • [ARMS-CC 2014] Shadi Ibrahim, Diana Moise, Houssem-Eddine Chihoub, Alexandra Carpen-Amarie, Luc Bougé and Gabriel Antoniu, Towards Efficient Power Management in MapReduce: Investigation of CPU-Frequencies Scaling on Power Efficiency in Hadoop. In the ARMS-CC workshop, which will be held in Paris, on July 15th, in conjunction with PODC-2014.
  • [DIDC 2014] Orcun Yildiz, Matthieu Dorier, Shadi Ibrahim, Gabriel Antoniu, A Performance and Energy Analysis of I/O Management Approaches for Exascale Systems, In the Sixth International Workshop on Data Intensive Distributed Computing (DIDC 2014), held in conjunction with the 23rd International Symposium on High Performance Distributed Computing (HPDC 2014), Vancouver, Canada, June 23-27, 2014.
  • [IPDPS 2014] Matthieu Dorier, Gabriel Antoniu, Robert Ross, Dries Kimpe, Shadi Ibrahim, CALCioM: Mitigating I/O Interference in HPC Systems through Cross-Application Coordination, In IPDPS 2014, International Parallel and Distributed Processing Symposium.

2013

  • [IJPP] Hai Jin, Honglei Jiang, Shadi Ibrahim, Xiaofei Liao, Inaccuracy in Private BitTorrent Measurements. In the International Journal of Parallel Programming 2013.
  • [PPNA 2013] Shadi Ibrahim, Hai Jin, Lu Lu, Bingsheng He, Gabriel Antoniu, Song Wu, Handling Partitioning Skew in MapReduce using LEEN. In the PPNA Journal 2013.
  • [FGCS 2013] Hai Jin, Xiao Ling, Shadi Ibrahim, Wenzhi Cao, Song Wu, Gabriel Antoniu, Flubber: Two-level Disk Scheduling in Virtualized Environment. To appear in the FGCS Journal (CCGrid 2012 special issue).
  • [Book Chapter] Houssem-Eddine Chihoub, Shadi Ibrahim, Gabriel Antoniu, Maria S. Perez, Consistency Management in Cloud Storage Systems. Book Chapter in Advances in data processing techniques in the era of Big Data (Book Chapter Accepted).
  • [MASCOTS 2013] Xiao Ling, Shadi Ibrahim, Hai Jin, Song Wu, Songqiao Tao, Exploiting Spatial Locality to Improve Disk Efficiency in Virtualized Environments. The IEEE 21st International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems MASCOTS 2013, August 14-16, 2013 in San Francisco, CA (Mascots 2013).
  • [CCGrid2013] Houssem-Eddine Chihoub, Shadi Ibrahim, Gabriel Antoniu, Maria S. Perez, Consistency in the Cloud: When Money Does Matter!. The 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing CCGrid 2013, May 13-16, 2013, Delft, the Netherlands (CCGRID2013).

2012

  • [Cluster 2012] Houssem-Eddine Chihoub, Shadi Ibrahim, Gabriel Antoniu, Maria S. Perez, Harmony: Towards Automated Self-Adaptive Consistency in Cloud Storage. The IEEE International Conference on Cluster Computing Cluster 2012, Sep 24-28, 2012, Beijing, China (Cluster 2012).
  • [CCGrid 2012] Shadi Ibrahim, Hai Jin, Lu Lu, Bingsheng He, Gabriel Antoniu, Song Wu, Maestro: Replica-Aware Map Scheduling for MapReduce. The 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing CCGrid 2012, May 13-16, 2012, Ottawa, Canada (CCGRID2012).
  • [CCGrid 2012] Xiao Ling, Hai Jin, Shadi Ibrahim, Wenzhi Cao, Song Wu, Efficient Disk I/O Scheduling with QoS guarantees for Xen-based Platforms. The 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing CCGrid 2012, May 13-16, 2012, Ottawa, Canada (CCGRID2012).

2011

  • [JoSC] Haijun Cao, Hai Jin, Song Wu, Shadi Ibrahim, Petri Net based Grid Workflow Verification and Optimization. Journal of Supercomputing, Aug. 2011.
  • [Book Chapter] Hai Jin, Shadi Ibrahim, Li Qi, Haijun Cao, Song Wu, Xuanhua Shi, The MapReduce Programming Model and Implementations. Book Chapter in Cloud Computing: Principles and Paradigms, Wiley Press, 28 Mar 2011.
  • [ICPP 2011] Shadi Ibrahim, Hai Jin, Lu Lu, Bingsheng He, Song Wu, Adaptive I/O Scheduling for MapReduce in Virtualized Environment. The 40th Annual International Conference on Parallel Processing ICPP 2011, Sep 13-16, Taiwan (ICPP2011).
  • [SCC 2011] Shadi Ibrahim, Bingsheng He, Hai Jin, Towards Pay-As-You-Consume Cloud Computing. IEEE 8th International Conference on Services Computing SCC 2011, July 4-9, Washington, USA (SCC2011).

2010

  • [Book Chapter] Hai Jin, Shadi Ibrahim, Tim Bell, Wei Gao, Dachuan Huang, Song Wu, Cloud Types and Services. Book Chapter in the Handbook of Cloud Computing, Springer Press, 26 Sep 2010.
  • [Book Chapter] Hai Jin, Shadi Ibrahim, Tim Bell, Li Qi, Haijun Cao, Song Wu, Xuanhua Shi, Tools and technologies for building the Clouds. Book Chapter in Cloud Computing: Principles Systems and Applications, Springer Press, 2 Aug 2010.
  • [CloudCom 2010] Shadi Ibrahim, Hai Jin, Lu Lu, Bingsheng He, Li Qi, Song Wu, LEEN: Locality/Fairness-aware key partitioning for MapReduce in the Cloud. 2nd IEEE International Conference on Cloud Computing Technology and Science, November 30-December 3, 2010, Indiana, USA (Cloudcom2010).
  • [MapReduce 2010] Dachuan Huang, Xuanhua Shi, Shadi Ibrahim, Lu Lu, Song Wu, Hai Jin, MR-Scope: A Real Time Tracing Tool for MapReduce. The First International Workshop on MapReduce and its Applications (MAPREDUCE’10), in conjunction with ACM HPDC 2010, June 22nd, 2010, Chicago, IL, USA.

2009

  • [CloudCom 2009] Shadi Ibrahim, Hai Jin, Lu Lu, Li Qi, Song Wu, Xuanhua Shi, Evaluating MapReduce on Virtual Machines: The Hadoop Case. 1st International conference on Cloud Computing (CloudCom2009), Dec 1-4, 2009, Beijing, China.
  • [HPDC 2009] Shadi Ibrahim, Hai Jin, Cheng bin, Haijun Cao, Song Wu, Li Qi, CLOUDLET: Towards MapReduce Implementation on Virtual Machine. Poster Session, 18th International Symposium on High Performance Distributed Computing (HPDC-18), June 11-13, 2009, Munich, Germany.
  • [FITME 2009] Haijun Cao, Hai Jin, Song Wu, Shadi Ibrahim, ImageFlow: Workflow based Image Processing with Legacy Program in Grid. 2009 International Conference on Future Information Technology and Management Engineering (FITME 2009), Sanya, China, 13-14 Dec 2009.

2008

  • [ICTTA 08] Shadi Ibrahim, Hai Jin, Li Qi, Chunqiang Zeng, Grid Maintenance: Challenges and Existing Models. 3rd IEEE International Conference on Information & Communication Technologies: from Theory to Applications (ICTTA’08), 7-11 April 2008, Damascus, Syria.

Before 2008

  • [DCABES2007] Shadi Ibrahim, Qingping Guo, The Implementation of Course Discussion System Using JXTA. 6th International Conference on Distributed Computing and Applications for business, engineering and sciences (DCABES2007), 14-17 August 2007, YiChang, China, vol I, pp. 1000-1003.
  • [GCA 2005] Raihan Ur Rasool, Qingping Guo, Guo Yucheng, Zhou Zhen, Shadi Ibrahim, A Proposal of Next Generation Grid-Operating System. The 2005 International Conference on Grid Computing and Applications, GCA 2005, Las Vegas, Nevada, USA, June 20-23, 2005, pp. 90-98.

Current Projects

I am currently involved (member) in several projects: 
  • (2015 – 2019) BigStorage: Storage-based Convergence between HPC and Cloud to handle Big Data. This is a Marie Curie Innovative Training Networks (ITN) project. My contribution: Investigating the energy impact of the most common Cloud and HPC usage scenarios, with an emphasis on data management.
  • (2015 – 2019) DISCOVERY Inria Project Lab. My contribution: Investigating new VM deployment mechanisms that exploit locality and improve several aspects including reliability, performance, energy usage in a distributed infrastructure.
  • (2016 – 2018) JLESC: Joint Laboratory on Extreme-Scale Computing. This laboratory is jointly run by Inria, University of Illinois at Urbana-Champaign (UIUC), Argonne National Laboratory (ANL) and Barcelona Supercomputing Center (BSC). My contribution: I am leading a sub-project on Resource Management and Scheduling for Data-Intensive HPC Workflows and working on other sub-projects with a focus on optimizing and developing solutions to mitigate the impact of I/O interference on HPC application performance in next-generation Exascale machines.
  • (2016 – 2018) The DATA@EXASCALE 2 associate team (Ultra-scalable I/O and storage for Exascale systems). This is an associated team between the KerData team from INRIA Rennes – Bretagne Atlantique and Argonne National Laboratory (ANL). My contribution: Developing a novel scheduling framework and an I/O behavior prediction model for HPC applications.

Talks

  • Harmony: Towards Automated Self-Adaptive Consistency in Cloud Storage, Grid’5000 Winter School, Nantes, December 3-6, 2012. Received Best Presentation Award.
  • Maestro: Replica-Aware Map Scheduling for MapReduce, CCGrid 2012, Ottawa, Canada, May 2012.
  • Efficient Disk I/O Scheduling with QoS guarantees for Xen-based Platforms, CCGrid 2012, Ottawa, Canada, May 2012.
  • Grid Maintenance: Challenges and Existing Models, IEEE ICTTA 2008, Damascus, Syria, Apr 2008.
  • Evaluating MapReduce on Virtual Machines: The Hadoop Case, CloudCom2009, Beijing, China, Dec 2009.
  • Cloud Computing: an Overview, CS, HuaZhong University of Science & Technology, Nov 2008.
  • Large-Scale Data Computing in the Cloud, CS, HuaZhong University of Science & Technology, Feb 2009.

Responsibilities

  • Member of Grid’5000 Sites Committee: Responsible for the Rennes site.
  • Program Co-Chair for the 2017 ICA3PP conference, Helsinki, Finland, August 2017.
  • Program Co-Chair for the 2017 EBDMA Workshop, Madrid, Spain, May 2017.
  • PhD Consortium Co-Chair for the 2014 CloudCom conference, Singapore, December 2014.
  • Workshops Co-Chair for the 2014 ScalCom conference, Indonesia, December 2014.
  • Program Committee member: SC 2017, CCGrid 2017, IEEE Cluster 2016, CCGrid 2016, IEEE Cluster 2015, IEEE Cloudcom 2015, ICPADS 2015, IEEE CSE 2015, IEEE FCST 2015, IEEE ICA3PP 2015, IFIP NPC 2015, BigDataCloud 2015, MEDES 2015, SCRAMBL 2015, IEEE Cluster 2014, IEEE SCC 2014, IEEE Cloudcom 2014, ICPADS 2014, HPCC 2014, ICA3PP 2014, NPC 2014, MEDES 2014, ISPDC 2014, PICom-2014, SCRAMBL workshop 2014, CLOUD COMPUTING 2014, ICPP 2013, the CCGrid 2013 Doctoral Symposium, CloudCom 2013, NPC 2013, CLOUD COMPUTING 2013, EIDWT 2012.
  • Reviewer: IEEE Transactions on Parallel and Distributed Systems, IEEE Transactions on Cloud Computing, ACM Transactions on Internet Technology, Future Generation Computer Systems, IEEE Systems Journal, Journal of Supercomputing, Cluster Computing, Springer Transactions on Large-Scale Data and Knowledge-Centered Systems, MRI 2013-2014, Concurrency and Computation: Practice and Experience, The Computer Journal (Oxford), Computers and Electrical Engineering Journal, Multimedia Tools and Applications Journal, International Journal of Technology Marketing (IJTMKT), Journal of Engineering and Computer Innovations.