March 16, 2017
Task-based runtimes with opportunistic scheduling are necessary to
efficiently extract performance from modern HPC architectures, with
many multi-core multi-GPU nodes. The stochastic and unstructured
nature of such dynamic systems makes performance problems
particularly hard to identify, especially if one wants to evaluate
whether improvements are still possible. We present and
demonstrate visual analysis techniques to evaluate the performance of
task-based applications on hybrid multi-node architectures. Our
approach is based on the composition of modern data analysis tools
(pjdump, R, ggplot2, plotly), enabling an agile scripting framework
with low development cost and maximum flexibility. We validate our
proposal by analyzing traces from the full-fledged implementation of
the Cholesky decomposition available in the MORSE library running on a
hybrid (CPU/GPU) multi-node platform using the StarPU+MPI framework.