Talks

datamove-tt talks

We regularly organize talks on a wide variety of topics. Attendance is often by invitation only, but we share our program with everyone here. When available, public materials are referenced on this page.

Contact. millian.poquet@inria.fr
Calendar. https://zimbra.inria.fr/home/bruno.raffin@inria.fr/datamove-tt
Mailing list. https://listes.univ-grenoble-alpes.fr/sympa/info/lig-datamove-tt


2022-07-13 Data visualization and analysis in R

Presenter. Lucas Leandro Nesi

Slides. NYA

Abstract. Presenting experimental results is a ubiquitous step in research. Proper presentation is essential for the understanding and reproducibility of reported results. However, this step is often underestimated and conducted in an ad-hoc manner, with inappropriate tools offering little traceability, such as spreadsheet editors. This presentation will quickly introduce an approach based on the R language, ranging from raw data processing to visual representation. The presented methodology exploits the dplyr, tidyr, readr, patchwork, ggplot2 and plotly packages so that all transformations and manipulations applied to the data remain documented and uncomplicated. The content includes a brief introduction to the R language and its environment, raw data processing using the tidyverse, and data visualization with ggplot.
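The talk itself uses R; as a rough cross-language analogue of the same pipeline idea (read raw data, reshape, aggregate, plot), here is a hypothetical sketch in Python with pandas and matplotlib. The file name and column names are invented for illustration.

    # Hypothetical analogue of the talk's tidyverse pipeline, in Python.
    # "results.csv" and its columns are invented for this sketch.
    import pandas as pd
    import matplotlib.pyplot as plt

    raw = pd.read_csv("results.csv")                    # cf. readr::read_csv
    tidy = raw.melt(id_vars=["machine", "run"],         # cf. tidyr::pivot_longer
                    var_name="metric", value_name="value")
    summary = (tidy.groupby(["machine", "metric"], as_index=False)
                   .agg(mean_value=("value", "mean")))  # cf. dplyr::summarise

    fig, ax = plt.subplots()
    for metric, grp in summary.groupby("metric"):       # crude ggplot2-style grouping
        ax.plot(grp["machine"], grp["mean_value"], marker="o", label=metric)
    ax.set_xlabel("machine")
    ax.set_ylabel("mean value")
    ax.legend()
    fig.savefig("summary.png")

Every step is a plain function call on a data frame, so the whole chain from raw file to figure stays scripted and traceable, which is the point the talk makes with R.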


2022-06-10 Towards Reproducible Deep Learning Experiments

Presenter. Lucas Meyer

Slides. NYA

Abstract. The prolific deep learning literature involves many experiments, which the scientific community must be able to replicate. This replication constraint is far from being easily met. Training deep learning architectures requires tuning many hyperparameters, and tracking which combination of those parameters yields the best performance is not trivial. Moreover, deep learning trends point towards bigger datasets and bigger architectures, so it is common to use high-performance computing resources for training. How can we make deep learning experiments easier to replicate in such settings? This practical session presents a couple of recent frameworks, namely pytorch-lightning, weights-and-biases and hydra, used to ease the replication of deep learning experiments.
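To give a flavour of how these frameworks compose, here is a minimal hypothetical training script: hydra reads the configuration, pytorch-lightning runs the training loop, and weights-and-biases records the run. The config keys, the model, and the conf/train.yaml file are invented for this sketch; exact APIs vary across library versions.

    # Hypothetical sketch: hydra (config) + pytorch-lightning (training)
    # + weights-and-biases (tracking). Assumes a conf/train.yaml file
    # defining seed, lr, max_epochs and project.
    import hydra
    from omegaconf import DictConfig
    import torch
    from torch.utils.data import DataLoader, TensorDataset
    import pytorch_lightning as pl
    from pytorch_lightning.loggers import WandbLogger

    class TinyModel(pl.LightningModule):
        def __init__(self, lr):
            super().__init__()
            self.save_hyperparameters()          # record hyperparameters with the run
            self.layer = torch.nn.Linear(32, 1)

        def training_step(self, batch, batch_idx):
            x, y = batch
            loss = torch.nn.functional.mse_loss(self.layer(x), y)
            self.log("train_loss", loss)
            return loss

        def configure_optimizers(self):
            return torch.optim.SGD(self.parameters(), lr=self.hparams.lr)

    @hydra.main(config_path="conf", config_name="train")
    def main(cfg: DictConfig) -> None:
        pl.seed_everything(cfg.seed)             # reproducible runs
        data = DataLoader(TensorDataset(torch.randn(256, 32), torch.randn(256, 1)),
                          batch_size=32)         # dummy data for the sketch
        model = TinyModel(lr=cfg.lr)
        trainer = pl.Trainer(max_epochs=cfg.max_epochs,
                             logger=WandbLogger(project=cfg.project))
        trainer.fit(model, data)

    if __name__ == "__main__":
        main()

Any hyperparameter can then be overridden from the command line (e.g. python train.py lr=0.01 seed=3), and each combination is tracked as a separate run.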


2022-05-20 Reconciling high-performance computing with the use of third-party libraries?

Presenter. Marek Felšöci & Emmanuel Agullo

Slides. 220520-slides-emmanuel-agullo-marek-felsoci.pdf

Abstract. High-performance computing often requires relying on multiple software packages, each optimized for the target machine where the computations run. The optimization constraints are such that it is commonly accepted that deploying this software can only be done manually or by relying on the work of the target machine’s administrators (typically via environment modules). However, the amount of work involved when doing it yourself, or the unavailability of precisely the required functionality when relying on administrator-provided modules, leads many codes to reimplement functionality that could be obtained from third-party libraries, contrary to the canons of software engineering.

In this talk, we will first review a quest (CMake, Spack, and now Guix) for an environment that can reliably deploy high-performance computing software in a portable, high-performance, and reproducible manner, so that the use of third-party libraries is no longer a concern. We will then dedicate the remaining part of the presentation to a concrete example: a research study conducted within an ongoing PhD thesis, combining Guix [1], to ensure the reproducibility of the software environment, with Org mode [2], in an attempt to maintain an exhaustive, clear and accessible description [3] of the experiments, i.e., the source code and procedures involved in the construction of the experimental software environment, the execution of benchmarks, and the gathering and post-processing of the results.

[1] GNU Guix software distribution and transactional package manager (https://guix.gnu.org).
[2] The Org Mode 9.1 Reference Manual, Carsten Dominik, 2018.
[3] Literate Programming, Donald E. Knuth, 1984 (https://doi.org/10.1093/comjnl/27.2.97).


2022-04-29 Making MPI dynamic: Dynamic resource management with MPI sessions

Presenter. Martin Schreiber

Slides. NYA

Abstract. Current schedulers for supercomputers assume a fixed number of resources for an application parallelized with MPI. Allowing resources to vary during the execution of an application could lead to significant improvements in application throughput and energy consumption, better support for urgent computing, and new solver strategies, to name just a few benefits. However, the path to providing such flexibility in the right way is extremely challenging, as it involves additional support in the various components of the supercomputer stack.

This talk will focus primarily on the MPI part of this problem, namely the development of a robust, flexible, and future-proof interface for such dynamic resources for MPI-based applications.

Based on joint work with Jan Fecht, Maximilian Streubel, Dan Holmes, Martin Schulz and Howard Pritchard.


2022-04-15 ResPCT: Fast Checkpointing in Non-volatile Memory for Multi-threaded Applications

Presenter. Ana Khorguani

Slides. 220415-slides-ana-khorguani.pdf

Abstract. The advent of non-volatile main memory (NVMM) technologies is a great opportunity to build fast fault-tolerant programs, as they provide persistent storage that can be used as main memory. However, since processor caches remain volatile, solutions need to be designed to recover a consistent state from NVMM after a crash. This paper presents ResPCT, a checkpointing solution that makes multi-threaded programs fault tolerant by periodically flushing persistent data structures to NVMM. ResPCT relies on a new approach based on In-Cache-Line logging to efficiently track modifications during failure-free execution and restore a consistent state after a crash. ResPCT provides an API that allows programmers to locate potential restart points in their program, which simplifies the identification of the persistent program state and can also lead to performance benefits. Experiments run on real hardware with representative benchmarks and applications show that ResPCT can outperform all state-of-the-art solutions by up to 2.7x, and that its overhead can be as low as 4% at large core counts.


2022-04-08 Characterization of different user behaviors for demand response in data centers

Presenter. Maël Madon

Slides. 220407-slides-mael-madon.pdf

Abstract. Digital technologies are becoming ubiquitous while their impact increases. A growing part of this impact happens far away from the end users, in networks or data centers, contributing to a rebound effect. A solution for a more responsible use is therefore to involve the user. As a first step in this quest, this work considers the users of a data center and characterizes how they can contribute to curtailing the computing load for a short period of time solely by changing their job submission behavior. The contributions are: (i) an open-source plugin for the simulator Batsim to simulate users based on real data; (ii) the exploration of four types of user behaviors to curtail the load during a time window, namely delaying, degrading, reconfiguring or renouncing their job submissions. We study the impact of these behaviors on four metrics: the energy consumed during the time window, the energy consumed after it, the mean waiting time, and the mean slowdown. We also characterize the conditions under which the involvement of users is the most beneficial.
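As a toy illustration of the first behavior (delaying), and not of the actual Batsim plugin API, one can transform a submission trace so that any job submitted during the curtailment window waits for its end:

    # Hypothetical sketch of the "delaying" behavior: jobs that would be
    # submitted during the curtailment window are pushed back to its end.
    # This is not the Batsim plugin interface, only the idea behind it.
    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class Job:
        user: str
        submit_time: float   # seconds since the start of the trace
        walltime: float
        resources: int

    def delay_behavior(jobs, window_start, window_end):
        out = []
        for job in sorted(jobs, key=lambda j: j.submit_time):
            if window_start <= job.submit_time < window_end:
                out.append(replace(job, submit_time=window_end))
            else:
                out.append(job)
        return out

    trace = [Job("alice", 50.0, 3600, 16), Job("bob", 120.0, 600, 4)]
    print(delay_behavior(trace, window_start=100.0, window_end=200.0))

The other behaviors can be sketched the same way, e.g. shrinking the job (degrading), changing its resource request (reconfiguring), or dropping it entirely (renouncing).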


2022-04-01 Scheduling with Component Health Index: Complexity Study

Presenter. Ernest Foussard

Slides. 220401-slides-ernest-foussard.pdf

Abstract. Planning maintenance is fundamental in the management of industrial equipment. Two types of maintenance can be identified. Corrective maintenance is scheduled right after a breakdown to repair failed equipment and is usually very expensive. By contrast, preventive maintenance is scheduled upstream to prevent a potential failure. Efficient preventive maintenance policies preserve continuity and level of service while increasing the durability of the equipment. Thanks to recent technological advances in sensors, it is now possible to precisely monitor the wear of components and derive more efficient maintenance policies.

In this talk, a joint production/maintenance scheduling problem on a single machine with multiple components is presented. Each component has a health index which is consumed as production jobs are processed. When the health index is too low, a maintenance operation may be required before more jobs can be processed. An overview of the complexity of the problem and results on approximation algorithms will be presented for classical scheduling objective functions.
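To make the model concrete, here is a small invented instance: jobs consume a single component's health index, and a maintenance operation (with its own duration) restores it whenever the next job would exhaust the component.

    # Toy instance of the production/maintenance model (numbers invented):
    # jobs consume the component's health index; when the next job would
    # drive it below zero, a maintenance is inserted to restore it.
    def schedule(jobs, initial_health, restored_health, maintenance_time):
        """Process jobs in the given order and return the resulting makespan."""
        health, t = initial_health, 0.0
        for proc_time, wear in jobs:
            if wear > health:            # job would exhaust the component
                t += maintenance_time    # insert a preventive maintenance
                health = restored_health
            health -= wear
            t += proc_time
        return t

    jobs = [(3.0, 4), (2.0, 5), (4.0, 2), (1.0, 6)]  # (processing time, wear)
    print(schedule(jobs, initial_health=10, restored_health=10,
                   maintenance_time=5.0))            # -> 15.0

Even in this greedy form, the job order changes how many maintenances are needed, which is where the complexity and approximation questions of the talk come from.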


2022-03-25 Rust for NVRAM: Fine-Grained Access

Presenter. Louis Boulanger

Slides. 220325-slides-louis-boulanger.pdf

Abstract. Non-Volatile Memory (NVM, NVRAM) is a fairly new level in the memory hierarchy, exhibiting the persistence of other storage media (SSDs, HDDs) while boasting access speeds similar to those of RAM, as well as byte-wise access. This dual nature introduces specific challenges, especially when NVRAM is used as persistent RAM for storing applications’ objects. Rust is a systems programming language that allows programming at a fairly low level (comparable to C) while providing higher-level constructs and guarantees, one of which is “immutability by default”. We think that these properties make Rust the best choice for building reliable and secure NVRAM applications, and we present a data structure, the MCell, that allows for flexible compile-time verification of borrowing rules while allowing users to mutate only a part of its contents, which improves performance in the context of NVRAM.


2022-03-18 Hands-on SILECS/Grid’5000

Presenter. Pierre Neyron


2022-03-02 Evaluation of simulation models for Batsim

Presenter. Adrien Faure

Slides. 220302-slides-adrien-faure.pdf

Abstract. Batsim is a platform simulator that aims to simulate a distributed computing platform together with its applications, incorporating visible effects such as network contention. The simulation of distributed applications remains challenging, especially in a context where we need to simulate hundreds or thousands of applications at the same time. Built on top of SimGrid, Batsim benefits from a scalable network model and a way to express applications.

Among the models available in SimGrid, one in particular looks promising for simulating distributed applications, as it shows a good trade-off between performance and realism; however, it had never been validated. I will present the work done in my Ph.D. along with what I have done since in the Datamove team, and in particular how we invalidated our best model candidate.


2022-02-23 SIM-SITU: A Framework for the Faithful Simulation of In-Situ Workflows

Presenter. Valentin Honoré

Slides. 220223-slides-valentin-honore.pdf

Abstract. The amount of data generated by numerical simulations in various scientific domains, such as molecular dynamics, climate modeling, biology, or astrophysics, has led to a fundamental redesign of application workflows. The throughput and capacity of storage subsystems have not evolved as fast as the computing power of extreme-scale supercomputers. As a result, the classical post-hoc analysis of simulation outputs has become highly inefficient. In-situ workflows have emerged as a solution, in which simulation and data analytics are intertwined through shared computing resources, resulting in lower latencies. Determining the best allocation, i.e., how many resources to allocate to each component of an in-situ workflow, and mapping, i.e., where and at which frequency to run the data analytics component, is a complex task whose performance assessment is crucial to the efficient execution of in-situ workflows. However, such a performance evaluation of different allocation and mapping strategies usually relies either on directly running them on the targeted execution environments, which can rapidly become extremely time- and resource-consuming, or on simulating simplified models of the components of an in-situ workflow, which can lack realism. In both cases, the validity of the performance evaluation is limited. To address this issue, we introduce SIM-SITU, a framework for the faithful simulation of in-situ workflows. This framework builds on the SimGrid toolkit and benefits from several important features of this versatile simulation tool. We designed SIM-SITU to reflect the typical structure of in-situ workflows, and thanks to its modular design, SIM-SITU has the necessary flexibility to easily and faithfully evaluate the behavior and performance of various allocation and mapping strategies for in-situ workflows. We illustrate the simulation capabilities of SIM-SITU on a molecular dynamics use case. We study the impact of different allocation and mapping strategies on performance and show how users can leverage SIM-SITU to determine interesting trade-offs when designing their in-situ workflow.


2022-02-16 Affinity-Aware Capacity Planning for Scheduling Long-Running Applications in Shared Clusters

Presenter. Clément Mommessin

Slides. 220216-slides-clement-mommessin.pdf

Abstract. A significant portion of production clusters is now dedicated to long-running applications (LRAs), which execute for hours or even months. LRAs typically have (i) hard co-location affinity constraints and (ii) time-varying resource requirements in a shared cluster. Existing work on LRA scheduling is often application-agnostic and does not specifically address these requirements. In this paper we present an affinity-aware capacity planning approach that minimizes the number of required compute nodes in a shared cluster while satisfying affinity and varying resource restrictions. We formulate the planning as an ILP problem, investigating a variety of algorithms and providing a suite of application-centric, node-centric, and application-node matching methods that are well suited for large-scale LRA deployment. Experiments driven by the Alibaba Tianchi dataset show that our algorithms achieve competitive scheduling effectiveness and running time compared with the heuristics used by Medea and LRASched on instances with fixed resource requirements, with a family of algorithms using binary search showing significant effectiveness improvements at the cost of increased running time.


2022-02-09 Teaching the Cloud

Presenter. Christophe Cérin

Slides. 220209-slides-christophe-cerin.pdf

Abstract. The talk includes an introduction to several Cloud technologies (AWS, OpenStack) and teaching material to give students hands-on experience with them. The first axis aims at making students apply on Cloud infrastructures what they already know (e.g., running a database benchmark, deploying a web software stack…). It will be shown how to manage resources programmatically. The second axis is a project-oriented approach where students are in charge of cloudifying toy applications so that they can be deployed on Cloud infrastructures.


2022-02-02 How can we estimate the energy consumption of a learning algorithm?

Presenter. Mathilde Jay

Slides. 220202-slides-mathilde-jay.pdf

Abstract. The usage of ICTs, and in particular of AI algorithms, is growing, and so is their environmental impact. According to The Shift Project (March 2021), digital technologies were responsible for 3.5% of greenhouse gas emissions in 2019, which is more than civil aviation, and this impact grows by 6% per year. Better understanding the environmental impacts of AI is a necessary step towards reducing them. I will present existing technologies for measuring the energy consumption of ICTs: RAPL files, monitoring software such as PowerAPI and Scaphandre, Python libraries, online tools and, of course, wattmeters. I will give an overview of those technologies and compare a selection of them on a simple AI use case using Grid'5000.
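As an example of the Python-library route, the pyRAPL package reads the RAPL energy counters; a minimal sketch (Intel CPUs only, needs read access to the powercap files, and the measured workload is a stand-in):

    # Minimal pyRAPL sketch (pip install pyRAPL). RAPL counters exist on
    # Intel CPUs; reading them may require elevated privileges.
    import pyRAPL

    pyRAPL.setup()                       # detect the available RAPL domains

    meter = pyRAPL.Measurement("training-step")
    meter.begin()
    total = sum(i * i for i in range(10_000_000))   # stand-in for a training step
    meter.end()

    res = meter.result
    print("duration (us):", res.duration)
    print("package energy (uJ):", res.pkg)          # one entry per CPU socket
    print("DRAM energy (uJ):", res.dram)

Such counters only cover the CPU package and DRAM domains, which is one reason to compare them against external wattmeters.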


2021-12-08 The Dual Fitting Method applied to Two-Agent Scheduling (proof of competitive ratio)

Presenter. Vincent Fagnon

Slides. 211208-slides-vincent-fagnon.pdf

Abstract. In several situations in an IoT environment, sensors produce data that should be analyzed locally, alongside the execution of external tasks submitted by a superior authority. Here, we model such a system as a problem of scheduling two sets of tasks, each with its own objective: 1. an online set of released tasks that can be preempted, with the objective of minimizing mean flow time, and 2. an offline set of tasks that cannot be preempted and must be executed before a common deadline. Taken separately, these two problems are trivial: SRPT is optimal for the first set, and any list algorithm respects the deadline for the second set. In this talk we will show that any algorithm has an arbitrarily bad competitive ratio, and how we use resource augmentation (speed increase and rejection) to cope with the pathological cases that make the problem structurally hard. The focus will be on the LP model we propose, and particularly on the method we use (dual fitting) to derive an upper bound on our algorithm’s competitive ratio. The talk will end with a time of reflection and a presentation of conjectures and of the avenues we are considering to improve our results (an “exploratory” part where you are greatly encouraged to participate and provide new ideas: LP modifications and relaxations, the cutting-plane method / Gomory cuts, Lagrangian relaxation…).
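For readers unfamiliar with SRPT (Shortest Remaining Processing Time), here is a toy single-machine, preemptive SRPT simulator computing the mean flow time it achieves; the instance is invented.

    # Toy preemptive SRPT on one machine: at any instant, run the released
    # job with the shortest remaining processing time. The flow time of a
    # job is its completion time minus its release time.
    import heapq

    def srpt_mean_flow_time(jobs):
        """jobs: list of (release_time, processing_time) pairs."""
        jobs = sorted(jobs)                # by release time
        ready, flows, t, i = [], [], 0.0, 0
        while i < len(jobs) or ready:
            if not ready:                  # idle until the next release
                t = max(t, jobs[i][0])
            while i < len(jobs) and jobs[i][0] <= t:
                heapq.heappush(ready, (jobs[i][1], jobs[i][0]))
                i += 1
            rem, rel = heapq.heappop(ready)
            next_release = jobs[i][0] if i < len(jobs) else float("inf")
            run = min(rem, next_release - t)   # run until done or next arrival
            t += run
            if rem > run:
                heapq.heappush(ready, (rem - run, rel))  # preempted
            else:
                flows.append(t - rel)
        return sum(flows) / len(flows)

    print(srpt_mean_flow_time([(0, 10), (1, 2), (2, 1)]))  # -> 5.666...

The difficulty discussed in the talk starts when these preemptible jobs must share the machine with the non-preemptible, deadline-constrained ones.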


2021-12-01 An introduction to Linux kernel programming with eBPF

Presenter. Baptiste Jonglez

Slides. 211201-slides-baptiste-jonglez.pdf

Abstract. Have you ever dreamed of becoming a Linux kernel expert? With eBPF, you (almost) no longer need to be an expert to program the kernel! I will give a high-level overview of eBPF, focusing first on its peculiar programming model. I will then detail a few applications in system visibility and high-speed network dataplane processing.
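As a taste of that programming model, the canonical bcc “hello world” attaches a tiny eBPF program (written in C, compiled at runtime) to the clone() syscall; it requires the bcc toolkit and root privileges, and the exact kprobe name can vary across kernel versions.

    # Canonical bcc example: print a trace line every time a process is
    # cloned. Requires bcc (https://github.com/iovisor/bcc) and root.
    from bcc import BPF

    program = r"""
    int kprobe__sys_clone(void *ctx) {
        bpf_trace_printk("Hello, eBPF!\n");
        return 0;
    }
    """

    b = BPF(text=program)   # compile and load the program into the kernel
    b.trace_print()         # stream the kernel trace pipe to stdout

The eBPF program runs inside the kernel under the verifier's safety checks, while the Python side only loads it and reads its output, which is what makes kernel programming (almost) accessible to non-experts.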


2021-11-24 Decentralized Meta Frank Wolfe for Online Neural Network Optimization

Presenter. Tuan Anh Nguyen

Slides. 211114-slides-tuan-anh-nguyen.pdf

Abstract. The design of decentralized learning algorithms is important in a fast-growing world in which data are distributed over participants with limited local computation and communication resources. In this direction, we propose an online algorithm minimizing non-convex loss functions aggregated from individual data/models distributed over a network. We provide a theoretical performance guarantee for our algorithm and demonstrate its utility on a real-life smart building use case.
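The decentralized meta variant is beyond a snippet, but the Frank-Wolfe building block itself is short: a projection-free step that only needs a linear minimization oracle over the constraint set. Here is a hedged sketch on the probability simplex; the objective and numbers are invented.

    # Classic Frank-Wolfe on the probability simplex: the linear
    # minimization oracle returns the vertex with the smallest gradient
    # coordinate, so no projection is ever needed.
    import numpy as np

    def frank_wolfe_simplex(grad_f, x0, steps):
        x = x0.copy()
        for t in range(1, steps + 1):
            g = grad_f(x)
            s = np.zeros_like(x)
            s[np.argmin(g)] = 1.0        # LMO over the simplex
            gamma = 2.0 / (t + 2.0)      # standard diminishing step size
            x = (1 - gamma) * x + gamma * s
        return x

    # Example: minimize ||x - c||^2 over the simplex (c invented).
    c = np.array([0.2, 0.5, 0.3])
    x = frank_wolfe_simplex(lambda x: 2 * (x - c), np.array([1.0, 0.0, 0.0]), 200)
    print(x)   # converges towards c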


2021-11-17 Backfilling scheduling with job partitioning

Presenter. Danilo Carastan dos Santos

Slides. 211117-slides-danilo-carastan-dos-santos.pdf

Abstract. In backfilling HPC scheduling, large/long jobs can clog the platform: many jobs (especially small/short ones) must wait for a large/long job to finish, which drastically degrades the scheduling metrics (waiting time, flow time, or slowdown). A typical solution to this problem is to reserve a portion of the platform exclusively for small/short jobs. This strategy ensures that small/short jobs keep flowing through the platform, but the flow rate is bounded by the size of the reserved portion. This talk aims to discuss and collect feedback about ways to use the whole platform to keep jobs flowing. I will present a job partitioning strategy, which consists of splitting large/long jobs into many partitions at submission time; the scheduler submits a partition of a job to the waiting queue only after the preceding partitions have completed. The expected consequences of job partitioning are: (i) it creates “breathing windows” in the schedule, giving room to unclog the platform; (ii) large/long jobs will wait longer, but scheduling small/short jobs in these breathing windows outweighs the increase in their waiting time; and (iii) it alleviates the effects of inaccurate runtime estimates in backfilling, since all partitions of a job except the last one have accurate processing times.
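A hypothetical sketch of the partitioning step itself (not the scheduler's code): a long job is cut into chained partitions, each one eligible for submission only once its predecessor completes.

    # Hypothetical sketch of job partitioning: a long job becomes a chain
    # of partitions of at most max_len time units; partition k may only
    # enter the waiting queue after partition k-1 completes.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Partition:
        job_id: str
        index: int
        length: float
        depends_on: Optional[int]   # preceding partition, None for the first

    def partition_job(job_id, walltime, max_len):
        lengths = []
        remaining = walltime
        while remaining > 0:
            lengths.append(min(max_len, remaining))
            remaining -= max_len
        return [Partition(job_id, k, l, k - 1 if k > 0 else None)
                for k, l in enumerate(lengths)]

    # A 10-hour job becomes five chained 2-hour partitions; short jobs can
    # be backfilled into the breathing windows between them.
    for p in partition_job("job-42", walltime=36000, max_len=7200):
        print(p)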


2021-10-20 Elastic Large Scale Ensemble Data Assimilation with Particle Filters for Continental Weather Simulation

Presenter. Sebastian Friedemann

Slides. 211020-slides-sebastian-friedemann.pdf

Abstract. Particle filters are a major tool for data assimilation (DA) in climate modeling. The ability to handle a very large number of particles is critical for high-dimensional climate models. The presented approach introduces a novel way of running such DA studies, exemplified by a particle filter using sequential importance resampling (SIR). The new approach executes efficiently on the latest high-performance computing platforms. It is resilient to numerical and hardware faults while minimizing data movements and enabling dynamic load balancing. Multiple particle propagations are performed per running simulation instance (runner) in each assimilation cycle. Particle weights are computed locally on each of these runners and transmitted to a central server that normalizes them, resamples new particles based on their weight, and redistributes the work to runners one by one to react to load imbalance. Our approach leverages the multi-level checkpointing library FTI, permitting particles to move from one runner to another in the background while particle propagation goes on. This also enables the number of runners to vary during the execution, either in reaction to failures and restarts, or to adapt to changing resource availability dictated by external decision processes. The approach is evaluated with the Weather Research and Forecasting (WRF) model to assess its performance for probabilistic weather forecasting. Several thousand particles on more than 20,000 compute cores are used to assimilate cloud cover observations into short-range weather forecasts over Europe.
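The resampling step at the heart of SIR is compact; here is a sketch of the server-side part only (normalize the weights received from the runners, then draw parent indices with systematic resampling), everything else (propagation, runner scheduling, FTI checkpointing) being elided.

    # Server-side SIR step: normalize particle weights, then draw, for each
    # new particle, the index of its parent (systematic resampling).
    import numpy as np

    def sir_resample(weights, rng):
        w = np.asarray(weights, dtype=float)
        w /= w.sum()                                   # normalize on the server
        n = len(w)
        positions = (rng.random() + np.arange(n)) / n  # one draw, evenly spread
        return np.searchsorted(np.cumsum(w), positions)

    rng = np.random.default_rng(42)
    print(sir_resample([0.1, 0.4, 0.2, 0.3], rng))
    # heavy particles are duplicated, light ones are dropped

The presented approach is about everything around this step: computing the weights on the runners and moving the resampled particles between them in the background via FTI.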


2021-09-29 The mathematical foundations of musical harmony

Presenter. Denis Trystram

Slides. 210929-slides-denis-trystram.pdf

Abstract. We sing in the shower, we whistle while walking down the street, some of us play an instrument, others are part of a band… Music is everywhere! But how many of us really know the foundations of musical harmony? In this talk, we will present how musical scales were formed and how they evolved over time. It will be an opportunity to encounter great mathematicians such as Pythagoras, Mersenne and Euler… No musical or mathematical background is needed to follow this journey through the universe of notes and rational numbers.


2021-09-22 Non-intrusive harvesting of grid idle resources: a control-based approach

Presenter. Quentin Guilloteau

Slides. 210922-slides-quentin-guilloteau.pdf

Abstract. High-performance computing systems are facing more and more variability in their performance (e.g., in input/output (I/O) and power consumption). This variability makes them less predictable, which requires more run-time management to meet the requirements.

This can be addressed with a feedback approach, where a management feedback loop, in response to information monitored in the system and based on an analysis of this data, decides to activate system-level or application-level adaptation mechanisms.

One such regulation problem is found in the context of CiGri, a lightweight computing grid system which exploits the unused resources of a set of computing clusters. The computing power left over by the execution of premium cluster users’ HPC applications is used to execute smaller jobs, which are injected as much as the global system allows.

We present ongoing work on the design of a feedback loop to regulate this injection of jobs so as to avoid overloading the distributed file system, which would be detrimental to the global performance, while self-adapting to variations in load in order to make the best use of available resources.
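A regulation loop of this kind often boils down to a few lines once a sensor and an actuator exist; here is a hedged sketch of a proportional-integral (PI) controller deciding how many jobs to inject per cycle. The sensor, the actuator and the gains are stand-ins invented for this sketch, not CiGri's actual interfaces.

    # Hedged PI-controller sketch: regulate the number of injected jobs so
    # that the file-server load tracks a setpoint. Gains, sensor and
    # actuator are invented stand-ins, not CiGri's real interfaces.
    import random
    import time

    KP, KI = 2.0, 0.5           # invented proportional/integral gains
    LOAD_SETPOINT = 0.7         # target file-server load (fraction of capacity)

    def read_fileserver_load():
        # stand-in sensor: replace with a real measurement of the I/O load
        return random.uniform(0.0, 1.0)

    def inject_jobs(n):
        # stand-in actuator: replace with the submission of n best-effort jobs
        print(f"injecting {n} jobs")

    integral = 0.0
    for cycle in range(10):     # one iteration per regulation cycle
        error = LOAD_SETPOINT - read_fileserver_load()
        integral += error       # the integral term removes steady-state error
        n_jobs = max(0, round(KP * error + KI * integral))
        inject_jobs(n_jobs)
        time.sleep(1)           # shortened cycle length for the sketch

One appeal of the control-theoretic framing is that such gains can be chosen with stability and response-time guarantees instead of being hand-tuned.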
