Modeling and implementation of parallel matrix multiplication with communication minimization, by Thomas Lambert
– April 11, 2019
With the emergence of High-Performance Computing (HPC) and Big Data applications, new critical challenges have appeared. Among them is the problem of data transfer, that is, communications between machines, which can introduce delays in large computations and also affects energy consumption. In this talk we focus on reducing communications for a particular problem: matrix multiplication. We first consider two theoretical models, based respectively on the partitioning of a square and of a cube, as well as the existing approximation algorithms for this problem. We then turn to putting these algorithms into practice, with an implementation of matrix multiplication on a heterogeneous platform.
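To make the square-partitioning model concrete, here is a minimal Python sketch (our illustration, not code from the talk): each processor owns a rectangle of the result matrix and needs the matching rows and columns of the input matrices, so its communication volume is proportional to the half-perimeter of its rectangle. The problem is then to cover the square with rectangles whose areas match the processors' relative speeds while minimizing the total half-perimeter.

```python
# Illustrative sketch of the square-partitioning model: communication
# volume of a processor owning a (height, width) rectangle of the result
# matrix is proportional to its half-perimeter, height + width.

def comm_volume(rectangles):
    """Sum of half-perimeters of a list of (height, width) rectangles."""
    return sum(h + w for h, w in rectangles)

def column_partition(speeds):
    """A deliberately naive heuristic: one rectangle per processor,
    stacked in a single column of the unit square, with areas
    proportional to the processors' relative speeds."""
    total = sum(speeds)
    return [(s / total, 1.0) for s in speeds]  # (height, width)

speeds = [4, 2, 1, 1]              # hypothetical relative speeds
rects = column_partition(speeds)
print(comm_volume(rects))          # a square of area a has half-perimeter
                                   # 2*sqrt(a), the per-rectangle lower bound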
One can only gain by replacing EASY Backfilling: A simple scheduling policies case study, by Salah Zrigui (Datamove)
– May 9, 2019
High-Performance Computing (HPC) platforms are growing in size and complexity. In order to improve the quality of service of such platforms, researchers are devoting a great amount of effort to devising algorithms and techniques to improve different aspects of performance, such as energy consumption, total usage of the platform, and fairness between users. In spite of this, system administrators remain reluctant to deploy state-of-the-art scheduling methods, and most of them revert to EASY-backfilling, also known as EASY-FCFS (EASY First-Come-First-Served). Newer methods are frequently complex and obscure, and the simplicity and transparency of EASY are too important to sacrifice.
In this work, we used execution logs from five HPC platforms to compare four simple scheduling policies: FCFS, Shortest estimated Processing time First (SPF), Smallest Requested Resources First (SQF), and Smallest estimated Area First (SAF). Using simulations, we performed a thorough analysis of the cumulative results for up to 180 weeks and considered three scheduling objectives: waiting time, slowdown, and per-processor slowdown. We also evaluated other effects, such as the relationship between job size and slowdown, the distribution of slowdown values, and the number of backfilled jobs, for each HPC platform and scheduling policy.
We conclude that one can only gain by replacing EASY-backfilling with SAF with backfilling, as it offers improvements in performance of up to 80% on the slowdown metric while maintaining the simplicity and transparency of FCFS. Moreover, SAF reduces the number of jobs with large slowdowns, and the inclusion of a simple thresholding mechanism guarantees that no starvation occurs. Finally, we propose SAF as a new benchmark for future scheduling studies.
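To make the comparison concrete, here is a hedged Python sketch (the field names and this exact formulation are ours, not the paper's): the four policies differ only in the key used to order the waiting queue, and the slowdown objective is the usual ratio of time spent in the system to actual running time.

```python
# Minimal sketch of the four queue-ordering policies compared in the talk.
# Job field names are hypothetical; a real scheduler would also apply
# backfilling on top of the chosen ordering.
from dataclasses import dataclass

@dataclass
class Job:
    submit: float     # submission time
    estimate: float   # user-estimated processing time
    resources: int    # number of requested processors

POLICIES = {
    "FCFS": lambda j: j.submit,                  # First-Come-First-Served
    "SPF":  lambda j: j.estimate,                # Shortest estimated Processing time First
    "SQF":  lambda j: j.resources,               # Smallest Requested Resources First
    "SAF":  lambda j: j.estimate * j.resources,  # Smallest estimated Area First
}

def order_queue(queue, policy):
    """Sort the waiting queue by the policy's priority key."""
    return sorted(queue, key=POLICIES[policy])

def slowdown(wait, run):
    """Standard slowdown metric: (waiting time + running time) / running time."""
    return (wait + run) / run
```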
Using Big Data Solutions to Improve HPC systems, by Thomas Ropars (LIG)
– May 16, 2019
Supercomputers are producing large amounts of data that need to be analyzed. They produce mostly two kinds of data: scientific data and monitoring data. Scientific data are the results of the execution of numerical simulations and need to be analyzed to extract knowledge. Monitoring data are produced by all kinds of sensors and software components and can be analyzed to detect, among other things, reliability and performance issues. Given the scale of such systems, the amount of data to process is huge, and analyzing these data with short response times is often necessary. Using techniques and algorithms from the Big Data community seems appealing in this context.
This talk will present some of our efforts to apply Big Data and Machine Learning techniques in the HPC context. It will cover three main topics: i) the use of Apache Spark Streaming as a tool for in-situ data analysis; ii) the analysis of time series to predict CPU overheating issues in supercomputers; iii) the application of classification algorithms to the placement problem in NUMA platforms.
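As a rough illustration of topic i) (this is our minimal sketch, not the pipeline from the talk; the socket source, field layout, and per-sensor aggregation are assumptions), Spark Streaming lets one process monitoring data in micro-batches as it is produced:

```python
# Hedged sketch: consuming a stream of "sensor,value" monitoring samples
# with Spark Streaming's micro-batch (DStream) API.
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName="InSituMonitoring")
ssc = StreamingContext(sc, batchDuration=5)  # 5-second micro-batches

# Hypothetical source: one "sensor,value" line per sample on a socket.
lines = ssc.socketTextStream("localhost", 9999)
readings = lines.map(lambda l: l.split(",")) \
                .map(lambda f: (f[0], float(f[1])))

# Per-batch maximum per sensor, e.g. to flag overheating CPUs early.
maxima = readings.reduceByKey(max)
maxima.pprint()

ssc.start()
ssc.awaitTermination()
```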
Modeling, Prediction and Optimization of Energy Consumption of MPI Applications using SimGrid, by Christian Heinrich (PhD defense)
– May 21, 2019
The High-Performance Computing (HPC) community is currently undergoing disruptive technology changes in almost all fields, including a switch towards massive parallelism with several thousand compute cores on a single GPU or accelerator, and new, complex networks. The energy consumption of these machines will continue to grow in the future, making energy one of the principal cost factors of machine ownership. This explains why the classic metric "flop/s", generally used to evaluate HPC applications and machines, is widely expected to be replaced by an energy-centric metric, "flop/watt".
One approach to predicting energy consumption is through simulation; however, an accurate simulation of the system is crucial to estimate the energy faithfully. In this thesis, we contribute to the performance and energy prediction of HPC architectures. We propose an energy model that we have implemented in the open-source SimGrid simulator, and we validate this model by carefully and systematically comparing it with real experiments. We leverage this contribution both to evaluate existing DVFS governors and to propose new ones that are particularly designed to suit the HPC context.
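For illustration, here is a hedged Python sketch of the kind of linear power model used by simulators such as SimGrid (the pstate table below is made up, and the model in the thesis is likely more refined): power at each DVFS pstate is interpolated between an idle and a fully-loaded wattage, and energy is the integral of power over time.

```python
# Hedged sketch of a linear power model: for each DVFS pstate, power is
# interpolated between idle and all-cores-busy wattages, and energy is
# the sum of power * duration over execution intervals.

PSTATES = {
    # pstate: (idle watts, all-cores-busy watts) -- hypothetical values
    0: (95.0, 200.0),   # highest frequency
    1: (90.0, 170.0),
    2: (85.0, 150.0),   # lowest frequency
}

def power(pstate, load):
    """Instantaneous power (watts) at a given CPU load in [0, 1]."""
    p_idle, p_full = PSTATES[pstate]
    return p_idle + load * (p_full - p_idle)

def energy(intervals):
    """Energy in joules for a list of (duration_s, pstate, load) intervals."""
    return sum(dt * power(ps, load) for dt, ps, load in intervals)

# E.g. 10 s fully loaded at pstate 0, then 5 s idle at pstate 2:
print(energy([(10.0, 0, 1.0), (5.0, 2, 0.0)]))   # 2000 + 425 = 2425 J
```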
Committee:
Martin SCHULZ, Professor, Technical University of Munich
Laurent LEFÈVRE, Research Scientist, Inria / ENS Lyon
Amina GUERMOUCHE, Assistant Professor, Télécom SudParis
Jean-François MÉHAUT, Professor, Grenoble-Alpes University
Arnaud LEGRAND, Senior Research Scientist, CNRS, PhD Advisor