Journée Zenith, Golfe de Coulondres, 25 May 2022
ML Model Management in Gypscie
Fabio Porto, LNCC, Petropolis, Brazil
To realize the full potential of data science, ML models (or models for short) must be built, combined, and ensembled, which can be very complex because there may be many models to select from. Furthermore, models should be shared and reused, in particular across different execution environments such as HPC or Spark clusters. To address this problem, we propose Gypscie, a new framework that supports the entire ML lifecycle and enables model reuse and import from other frameworks. The approach behind Gypscie is to combine, in a single framework, several rich capabilities for model management, data management, and model execution that are typically provided by different tools. Finally, Gypscie interfaces with multiple execution environments to run ML tasks, e.g., an HPC system such as the Santos Dumont supercomputer at LNCC or a Spark cluster.