2019 Scientific Progress
During the first months of Pierre’s postdoc, we explored new message-reordering methods that require neither an expensive calibration phase nor global synchronization. The first lead was to reorder messages statically. Several reordering strategies based on different criteria (message size, distance between nodes) were implemented and tested on the Cori supercomputer at NERSC. However, the results showed that, because of network noise and the high variability of communication times, the optimal ordering cannot be determined statically. The second lead is to reorder messages dynamically, so that the reordering decision adapts to the state of the network and to contention on the nodes. The dynamic strategy has yet to be defined and tuned, but ongoing work has shown potential performance benefits.
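As an illustration of what a static reordering looks like, the following minimal sketch fixes the send order before execution from a single criterion. It is not the code evaluated on Cori: the `Message` record, the `order_static` function, and the choice of sending the largest or farthest messages first are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical description of one pending send."""
    dest: int      # destination rank
    size: int      # payload size in bytes
    distance: int  # assumed hop count between source and destination nodes


def order_static(messages, criterion="size"):
    """Return the messages in a send order decided once, before execution.

    Assumption for this sketch: larger (or farther) messages are sent first.
    The opposite policy is obtained by dropping reverse=True.
    """
    if criterion == "size":
        return sorted(messages, key=lambda m: m.size, reverse=True)
    if criterion == "distance":
        return sorted(messages, key=lambda m: m.distance, reverse=True)
    raise ValueError(f"unknown criterion: {criterion}")


pending = [Message(1, 1024, 2), Message(2, 4096, 1), Message(3, 16, 5)]
by_size = [m.dest for m in order_static(pending, "size")]
by_distance = [m.dest for m in order_static(pending, "distance")]
```

Because the order depends only on properties known in advance, it cannot react to network noise or contention at run time, which is the limitation that motivates the dynamic approach.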
Hiding the latency of MPI operations – Many applications use blocking operations (point-to-point and collective) where non-blocking ones could be used instead. During summer 2019, an intern worked on an analysis that transforms existing applications to use non-blocking collectives. The results obtained so far are promising. A PhD thesis started in November 2019 to pursue this subject further. This will be a collaborative project with the CEA (French Alternative Energies and Atomic Energy Commission).
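The general shape of such a transformation can be sketched on a toy example. The sketch below is not the analysis developed during the internship; it only shows the idea of splitting a blocking collective into its non-blocking MPI-3 counterpart plus an `MPI_Wait` placed as late as possible. It assumes straight-line C statements given one per string, a request variable named `req`, and buffer names supplied by the caller; `split_and_delay` and `NONBLOCKING` are hypothetical names.

```python
import re

# Blocking collectives and their non-blocking MPI-3 counterparts.
NONBLOCKING = {"MPI_Allreduce": "MPI_Iallreduce", "MPI_Bcast": "MPI_Ibcast"}

# Matches a statement that is a single MPI call, e.g. "MPI_Bcast(buf, n, t, 0, comm);"
CALL = re.compile(r"^(\s*)(MPI_\w+)\((.*)\);\s*$")


def split_and_delay(statements, buffers):
    """Rewrite the first blocking collective and delay its completion.

    The wait is inserted just before the first later statement that mentions
    one of the given buffers (a conservative, purely textual check).
    """
    out, pending, rewritten = [], False, False
    for stmt in statements:
        m = CALL.match(stmt)
        if m and not rewritten and m.group(2) in NONBLOCKING:
            indent, name, args = m.groups()
            # The non-blocking form takes one extra argument: the request.
            out.append(f"{indent}{NONBLOCKING[name]}({args}, &req);")
            pending = rewritten = True
            continue
        if pending and any(re.search(rf"\b{re.escape(b)}\b", stmt) for b in buffers):
            indent = stmt[: len(stmt) - len(stmt.lstrip())]
            out.append(f"{indent}MPI_Wait(&req, MPI_STATUS_IGNORE);")
            pending = False
        out.append(stmt)
    if pending:
        # No later use of the buffers was found: complete at the end.
        out.append("MPI_Wait(&req, MPI_STATUS_IGNORE);")
    return out


original = [
    "MPI_Allreduce(local, global, n, MPI_DOUBLE, MPI_SUM, comm);",
    "compute_independent(x);",
    "use(global);",
]
transformed = split_and_delay(original, ["local", "global"])
```

In the transformed sequence, `compute_independent(x);` sits between the `MPI_Iallreduce` call and the `MPI_Wait`, so the collective can progress while independent computation runs. A real analysis would also have to handle control flow, aliasing, and multiple outstanding requests, none of which this sketch attempts.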
Correctness of MPI 3.0 one-sided communication – A PhD thesis started in March 2019 on the development of a method to assist developers who use MPI one-sided communication in their applications.