Calendar

Name: TAPIOCA: A topology-aware data aggregation library for parallel I/O, François Tessier, Argonne
Start: 2017-12-21T14:00:00+02:00
End: 2017-12-21T15:00:00+02:00
Location: Bâtiment IMAG

Events in November–December 2017

Joint POLARIS/DATAMOVE start-of-year day

1:15 pm – 4:30 pm
November 6, 2017

Bâtiment IMAG (amphitheater)
Saint-Martin-d'Hères, 38400
France

Keynote LIG: George Wright - State of the Art Media Research

2:00 pm – 3:00 pm
November 9, 2017

See LIG webpage.

Detecting Performance Outliers for Task-based HPC Applications in multi-[CPU|GPU|Node] clusters, by Lucas Schnorr (Porto Alegre)

2:00 pm – 3:00 pm
November 16, 2017

Programming paradigms in High-Performance Computing have
been shifting towards task-based models, which are capable of
adapting readily to heterogeneous and scalable
supercomputers. Detecting performance outliers in such environments
is particularly difficult because it must account for architecture
heterogeneity and variability. In this work we present how we
employed a very simple performance model to highlight task outliers
in the well-known tile-based dense Cholesky factorization running
on top of StarPU-MPI, a runtime for task-based applications. This
work has been integrated into our visualization framework based on the
R programming language and the tidyverse meta-package. Experiments
were conducted in a controlled environment using the Chifflet
cluster in Lille, part of the Grid'5000 infrastructure, using up to
eight nodes, each equipped with 28 cores and two GPUs. The
preliminary results, derived from collected traces, indicate that
explicit binding of the MPI and GPU-managing threads within
StarPU alleviates the issue, leading to performance gains.

Bâtiment IMAG (406)
Saint-Martin-d'Hères, 38400
France

Inria 50th year Celebration

1:00 pm – 5:00 pm
November 23, 2017

Convergence of no-regret algorithms, Amélie Heliou (Polaris)

2:00 pm – 3:00 pm
November 30, 2017

No-regret algorithms are often used in repeated games where players have little information about the game they are playing. These algorithms guarantee that each player's regret is sublinear. The time average of the strategies chosen by following a no-regret algorithm converges to the set of correlated equilibria. However, this gives no information about the convergence of the sequence of strategies itself.
We investigated the question: "Does the sequence of strategies produced by a no-regret algorithm converge to a Nash equilibrium?"
In this talk, I will present a no-regret algorithm called Hedge, a variant of exponential-weights algorithms. In particular, I will discuss the convergence of the strategy sequences produced by Hedge under two types of information available to the players.
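
For readers unfamiliar with Hedge, the following is a minimal sketch of the exponential-weights update in a two-player repeated game. It is not taken from the talk: the function names, the step size `eta`, the number of rounds, and the example payoff matrices are all illustrative assumptions. It also assumes full-information feedback (each player observes the expected payoff of every one of its actions), since the abstract does not say which two feedback models are discussed.

```python
import math

def hedge_weights(cumulative_payoffs, eta):
    """Mixed strategy proportional to exp(eta * cumulative payoff)."""
    m = max(cumulative_payoffs)  # subtract the max for numerical stability
    w = [math.exp(eta * (s - m)) for s in cumulative_payoffs]
    total = sum(w)
    return [x / total for x in w]

def play_hedge(payoff_a, payoff_b, rounds=2000, eta=0.1):
    """Both players run Hedge; payoff_x[i][j] is player X's payoff
    when A plays action i and B plays action j (assumed convention)."""
    n_a, n_b = len(payoff_a), len(payoff_a[0])
    score_a = [0.0] * n_a
    score_b = [0.0] * n_b
    for _ in range(rounds):
        x = hedge_weights(score_a, eta)
        y = hedge_weights(score_b, eta)
        # Full-information feedback: expected payoff of each own action
        # against the opponent's current mixed strategy.
        for i in range(n_a):
            score_a[i] += sum(payoff_a[i][j] * y[j] for j in range(n_b))
        for j in range(n_b):
            score_b[j] += sum(payoff_b[i][j] * x[i] for i in range(n_a))
    return hedge_weights(score_a, eta), hedge_weights(score_b, eta)

# Hypothetical example: a prisoner's dilemma (action 0 = cooperate, 1 = defect).
payoff_a = [[3, 0], [5, 1]]
payoff_b = [[3, 5], [0, 1]]
x, y = play_hedge(payoff_a, payoff_b)
print(x, y)
```

In this example defecting is strictly dominant for both players, so the strategies themselves (not only their time averages) concentrate on the unique pure Nash equilibrium. Games without dominant strategies can behave differently, which is the kind of question the abstract raises.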

Bâtiment IMAG (442)
Saint-Martin-d'Hères, 38400
France

Keynote

December 7, 2017

Learning efficient Nash equilibria in distributed systems, by Bary Pradelski (ETH Zurich)

2:30 pm – 3:30 pm
December 14, 2017

Joint work with H. Peyton Young.

An individual’s learning rule is completely uncoupled if it does not depend directly on the actions or payoffs of anyone else. We propose a variant of log-linear learning that is completely uncoupled and that selects an efficient (welfare-maximizing) pure Nash equilibrium in all generic n-person games that possess at least one pure Nash equilibrium. In games that do not have such an equilibrium, there is a simple formula that expresses the long-run probability of the various disequilibrium states in terms of two factors: i) the sum of payoffs over all agents, and ii) the maximum payoff gain that results from a unilateral deviation by some agent. This welfare/stability trade-off criterion provides a novel framework for analyzing the selection of disequilibrium as well as equilibrium states in n-person games.

Bâtiment IMAG (306)
Saint-Martin-d'Hères, 38400
France

Autotuning MPI Collectives using Performance Guidelines, Sascha Hunold

2:00 pm – 3:00 pm
December 18, 2017

MPI collective operations provide a standardized interface for performing data movements within a group of processes. The efficiency of collective communication operations depends on the actual algorithm, its implementation, and the specific communication problem (type of communication, message size, and number of processes). Many MPI libraries provide numerous algorithms for specific collective operations. The strategy for selecting an efficient algorithm is often predefined (hard-coded) in MPI libraries, but some of them, such as Open MPI, allow users to change the algorithm manually. Finding the best algorithm for each case is a hard problem, and several approaches to tuning these algorithmic parameters have been proposed. We take an orthogonal approach to the parameter tuning of MPI collectives: instead of testing individual algorithmic choices provided by an MPI library, we compare the latency of a specific MPI collective operation to the latency of semantically equivalent functions, which we call mock-up implementations. The structure of the mock-up implementations is defined by self-consistent performance guidelines. The advantage of this approach is that tuning using mock-up implementations is always possible, whether or not an MPI library allows users to select a specific algorithm at run time. We implement this concept in a library called PGMPITuneLib, which is layered between the user code and the actual MPI implementation. This library selects the best-performing algorithmic pattern of an MPI collective by intercepting MPI calls and redirecting them to our mock-up implementations. Experimental results show that PGMPITuneLib can significantly reduce the latency of MPI collectives and, equally important, that it can help identify the tuning potential of MPI libraries.

TAPIOCA: A topology-aware data aggregation library for parallel I/O, François Tessier, Argonne

2:00 pm – 3:00 pm
December 21, 2017

The growing computing power of supercomputers makes data movement
considerably more costly. Moreover, most scientific simulations have
substantial read and write requirements on parallel file systems. Many
software solutions have been developed to contain the bottleneck caused
by I/O. A well-known strategy in collective I/O operations is to select
a subset of the application's processes to aggregate contiguous chunks
of data before performing reads and writes. In this talk, I will present
TAPIOCA, an MPI library implementing an optimized, topology-aware data
aggregation algorithm. I will show the substantial read and write
performance gains we obtained on two supercomputers at Argonne National
Laboratory. Finally, I will discuss our ongoing work in TAPIOCA to take
advantage of the new memory and storage tiers available on current and
upcoming systems (MCDRAM, local SSDs, ...).

Bâtiment IMAG
Saint-Martin-d'Hères, 38400
France

