Calendar

January 24, 2019

Realistic simulation of the execution of applications deployed on large distributed systems with a focus on improving file management, by Anchen Chai (Insa Lyon)

Category: Seminars Realistic simulation of the execution of applications deployed on large distributed systems with a focus on improving file management, by Anchen Chai (Insa Lyon)


January 24, 2019

Simulation is a powerful tool to study distributed systems. It allows researchers to evaluate different scenarios in a reproducible manner, which is hardly possible in real experiments. However, the realism of simulations is rarely investigated in the literature, leading to a questionable accuracy of the simulated metrics. In this context, the main aim of our work is to improve the realism of simulations with a focus on file transfer in a large distributed production system (i.e., the EGI federated e-Infrastructure (EGI)). Then, based on the findings obtained from realistic simulations, we can propose reliable recommendations to improve file management in the Virtual Imaging Platform (VIP).
In order to realistically reproduce certain behaviors of the real system in simulation, we need to obtain an inside view of it. Therefore, we collect and analyze a set of execution traces of one particular application executed on EGI via VIP. The realism of simulations is investigated with respect to two main aspects in this thesis: the simulator and the platform model.
Based on the knowledge obtained from traces, we design and implement a simulator to provide a simulated environment as close as possible to the real execution conditions for file transfers on EGI. A complete description of a realistic platform model is also built by leveraging the information registered in traces. The accuracy of our platform model is evaluated by confronting the simulation results with the ground truth of real transfers. Our proposed model is shown to largely outperform the state-of-the-art model to reproduce the real-life variability of file transfers on EGI.
Finally, we cross-evaluate different file replication strategies by simulations using an enhanced state-of-the-art model and our platform model built from traces. Simulation results highlight that the instantiation of the two models leads to different qualitative decisions of replication, even though they reflect a similar hierarchical network topology. Last but not least, we find that selecting sites hosting a large number of executed jobs to replicate files is a reliable recommendation to improve file management of VIP. In addition, adopting our proposed dynamic replication strategy can further reduce the duration of file transfers except for extreme cases (very poorly connected sites) that only our proposed platform model is able to capture.

Bâtiment IMAG (442)
Saint-Martin-d'Hères, 38400
France

Comments are closed.