E2Clab: Exploring the Computing Continuum through Repeatable, Replicable and Reproducible Edge-to-Cloud Experiments

GitLab: https://gitlab.inria.fr/Kerdata/Kerdata-Codes/e2clab

Why E2Clab?

Distributed digital infrastructures for computation and analytics are evolving towards an interconnected ecosystem in which complex application workflows run across IoT Edge devices and the HPC Cloud (aka the Computing Continuum, the Digital Continuum, or the Transcontinuum). Understanding end-to-end performance in such a complex continuum is challenging. It comes down to reconciling many, typically conflicting, application requirements and constraints with low-level infrastructure design choices. One important challenge is to accurately reproduce the relevant behaviors of a given application workflow, together with representative settings of the physical infrastructure underlying this continuum.

What is E2Clab?

E2Clab is a framework that implements a rigorous methodology for moving from real-life application workflows to representative settings of the underlying physical infrastructure, in order to accurately reproduce the application's relevant behaviors and thus understand end-to-end performance. Understanding end-to-end performance means rigorously mapping the scenario characteristics to the experimental environment, identifying and controlling the relevant configuration parameters of applications and system components, and defining the relevant performance metrics. Furthermore, the methodology supports research quality aspects such as the Repeatability, Replicability, and Reproducibility of experiments, through a well-defined experimentation methodology and transparent access to experiment artifacts and results. This matters because it makes scientific claims verifiable by others, who can then build upon them.

E2Clab methodology and architecture

What does E2Clab allow?

E2Clab allows researchers to reproduce application behavior in a representative way in a controlled environment, run extensive experiments, and thereby understand the end-to-end performance of applications by correlating results with parameter settings. E2Clab provides a rigorous approach to answering questions such as: How can infrastructure bottlenecks be identified? Which system parameters and infrastructure configurations impact performance, and how?

High-level features provided by E2Clab:

  • Support experiment Repeatability, Replicability, and Reproducibility
  • Configure the whole experimental environment (layers & services; network; and application workflow) in a descriptive manner
  • Map between application parts and machines on the Edge, Fog and Cloud
  • Scale and vary scenario deployments
  • Manage experiment deployment and execution on large-scale testbeds (e.g., Grid’5000)
  • Back up metrics, log files, monitoring data, etc. generated during the execution of experiments
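
To make the idea of a descriptive configuration more concrete, here is a minimal Python sketch of how a scenario could be declared as data and then mapped onto Edge, Fog and Cloud layers. It is an illustration only and does not use E2Clab's actual configuration format or API: the names Service, NetworkRule, Scenario, placement and scaled, as well as all values, are hypothetical.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Service:
    """One application part and the layer it should run on (hypothetical model)."""
    name: str
    layer: str          # "edge", "fog" or "cloud"
    quantity: int = 1   # number of instances to deploy


@dataclass(frozen=True)
class NetworkRule:
    """A network constraint between two layers (hypothetical model)."""
    src: str
    dst: str
    delay_ms: float
    rate_mbit: float


@dataclass(frozen=True)
class Scenario:
    """A whole experiment described as data: services plus network constraints."""
    services: tuple = field(default_factory=tuple)
    network: tuple = field(default_factory=tuple)

    def placement(self) -> dict:
        """Map each layer to the service instances that should run on it."""
        result = {}
        for service in self.services:
            for index in range(1, service.quantity + 1):
                result.setdefault(service.layer, []).append(f"{service.name}-{index}")
        return result

    def scaled(self, layer: str, factor: int) -> "Scenario":
        """Return a new scenario with every service on `layer` multiplied by `factor`."""
        services = tuple(
            replace(s, quantity=s.quantity * factor) if s.layer == layer else s
            for s in self.services
        )
        return Scenario(services=services, network=self.network)


# Example scenario (all names and numbers are made up for illustration).
base = Scenario(
    services=(
        Service("producer", "edge", quantity=2),
        Service("gateway", "fog"),
        Service("analytics", "cloud"),
    ),
    network=(NetworkRule("edge", "fog", delay_ms=50.0, rate_mbit=10.0),),
)

print(base.placement())
print(base.scaled("edge", 3).placement())
```

Because the scenario is plain data, a variation of a deployment (here, three times as many Edge producers) is a new value derived from the base description rather than a hand-edited copy, which is one way to keep scaled runs comparable.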