Wednesday, January 31, 11:00, Room 2/124
A methodology for capturing and analyzing dataflow paths in computational simulations
Vitor Silva, COPPE/UFRJ, Rio de Janeiro
Large-scale scientific applications execute complex computational models in a specific field of science. They commonly generate a huge volume of scientific data, stored in data sources that can be raw data files or in-memory data structures. In this context, domain specialists often need to analyze part of these data to validate their scientific hypotheses. Besides analyzing single data sources, they also need to relate data from different data sources and to perform analyses while the application is still running, since an execution may take days or weeks, even in high-performance computing environments. A solution is therefore needed that extracts scientific and provenance data (to provide dataflow monitoring) and supports online dataflow analysis. For this exploratory scientific data analysis scenario, we propose a methodology for capturing and analyzing dataflow paths from scientific applications, based on modeling the dataflow, the scientific data, and the queries.