HPDaSc Publications

2023

[Akbarinia 2023] Reza Akbarinia, Christophe Botella, Alexis Joly, Florent Masseglia, Marta Mattoso, Eduardo Ogasawara, Daniel de Oliveira, Esther Pacitti, Fabio Porto, Christophe Pradal, Dennis Shasha, Patrick Valduriez. Life Science Workflow Services (LifeSWS): motivations and architecture. Transactions on Large-Scale Data- and Knowledge-Centered Systems, 25 pages, In press, 2023.

[Borges 2023] Heraldo Borges, Antonio Castro, Rafaelli Coutinho, Fabio Porto, Esther Pacitti, Eduardo Ogasawara. STMotif Explorer: A Tool for Spatiotemporal Motif Analysis. Simpósio Brasileiro de Banco de Dados (SBBD), Belo Horizonte, Brazil. 114-119, 2023.

[Castro 2023] Antonio Castro, Heraldo Borges, Cassio Souza, Jorge Rodrigues, Fabio Porto, Esther Pacitti, Rafaelli Coutinho, Eduardo Ogasawara. GSTSM Package: Finding Frequent Sequences in Constrained Space and Time. 39ème Conférence sur la Gestion de Données – Principes, Technologies et Applications (BDA), Montpellier, France, 1-2, 2023.

[Ribeiro 2023] Vitor Ribeiro, Eduardo Pena, Raphael Saldanha, Reza Akbarinia, Patrick Valduriez, Falaah Arif, Julia Stoyanovich, Fabio Porto. Subset Modelling: A Domain Partitioning Strategy for Data-efficient Machine-Learning. Simpósio Brasileiro de Banco de Dados (SBBD), Belo Horizonte, Brazil. 318-323, 2023.

[Rosendo 2023a] Daniel Rosendo, Kate Keahey, Alexandru Costan, Matthieu Simonin, Patrick Valduriez, Gabriel Antoniu. KheOps: Cost-effective Repeatability, Reproducibility, and Replicability of Edge-to-Cloud Experiments. ACM Conference on Reproducibility and Replicability (REP), 62-73, 2023.

[Rosendo 2023b] Daniel Rosendo, Marta Mattoso, Alexandru Costan, Renan Souza, Débora Pina, Patrick Valduriez, Gabriel Antoniu. ProvLight: Efficient Workflow Provenance Capture on the Edge-to-Cloud Continuum. IEEE International Conference on Cluster Computing (Cluster), 1-13, 2023.

[Rosendo 2023c] Daniel Rosendo. Methodologies for Reproducible Analysis of Workflows on the Edge-to-Cloud Continuum. Ph.D. thesis, University of Rennes, 1 June 2023, Best thesis award (2nd place), 39ème Conférence sur la Gestion de Données – Principes, Technologies et Applications (BDA), Montpellier, France. PhD advisors: Gabriel Antoniu, Alexandru Costan, Patrick Valduriez.

[Salles 2023a] Rebecca Salles, Janio Lima, Rafaelli Coutinho, Esther Pacitti, Florent Masseglia, Reza Akbarinia, Chao Chen, Jonathan M. Garibaldi, Fábio Porto, Eduardo S. Ogasawara. SoftED: Metrics for Soft Evaluation of Time Series Event Detection. 2023.

[Salles 2023b] Rebecca Salles, Esther Pacitti, Eduardo Bezerra, Celso Marques, Carla Pacheco, Carla Oliveira, Fabio Porto, Eduardo Ogasawara. TSPredIT: Integrated Tuning of Data Preprocessing and Time Series Prediction Models. Transactions on Large-Scale Data- and Knowledge-Centered Systems (TLDKS):41–55, 2023.

[Salles 2023c] Rebecca Salles. Online Event Detection over Nonstationary Time Series. Ph.D. thesis, CEFET/RJ, Rio de Janeiro, 12 September 2023. PhD advisors: Eduardo Ogasawara, Fabio Porto.

[Zorrilla 2023] Rocío Zorrilla, Eduardo Ogasawara, Patrick Valduriez, Fabio Porto. A Data-Driven Model Selection Approach to Spatio-Temporal Prediction. 39ème Conférence sur la Gestion de Données – Principes, Technologies et Applications (BDA), Montpellier, France, 1-12, 2023.

2022

[Chaves da Silva 2022] Anderson Chaves da Silva, Patrick Valduriez, Fabio Porto. Integrating Machine Learning Model Ensembles to the SAVIME Database System. Simpósio Brasileiro de Banco de Dados (SBBD), Buzios, Brazil. 232-238, 2022.

[Lima 2022] Janio Lima, Pedro Alpis, Rebecca Salles, Luciana Escobar, Fabio Porto, Esther Pacitti, Rafaelli Coutinho, Eduardo Ogasawara. Forward and Backward Inertial Anomaly Detector: A Novel Time Series Event Detection Method. IEEE International Joint Conference on Neural Networks (IJCNN), 2022.

[Pena 2022] Eduardo H. M. Pena, Fábio Porto, Felix Naumann. Fast Algorithms for Denial Constraint Discovery. Proc. VLDB Endowment 16(4): 684-696, 2022.

[Porto 2022] Fabio Porto, Patrick Valduriez. Data and Machine Learning Model Management with Gypscie. CARLA 2022 – Workshop on HPC and Data Sciences meet Scientific Computing, SCALAC, Porto Alegre, Brazil, 1-2, 2022.

[Rosendo 2022a] Daniel Rosendo, Alexandru Costan, Patrick Valduriez, Gabriel Antoniu. Distributed intelligence on the Edge-to-Cloud Continuum: A systematic literature review. Journal of Parallel and Distributed Computing, Elsevier, 166, 71-94, 2022.

[Rosendo 2022b] Daniel Rosendo, Alexandru Costan, Gabriel Antoniu, Patrick Valduriez. Reproducible Performance Optimization of Complex Applications on the Edge-to-Cloud Continuum. 38ème Conférence sur la Gestion de Données – Principes, Technologies et Applications (BDA), Clermont-Ferrand, France, 2022.

[Salles 2022a] Rebecca Salles, Esther Pacitti, Eduardo Bezerra, Fabio Porto, Eduardo Ogasawara. TSPred: A framework for nonstationary time series prediction. Neurocomputing, Elsevier, 467, 197-202, 2022.

[Salles 2022b] Rebecca Salles, Esther Pacitti, Eduardo Bezerra, Fabio Porto, Eduardo Ogasawara. TSPred: A framework for nonstationary time series prediction. 38ème Conférence sur la Gestion de Données – Principes, Technologies et Applications (BDA), Clermont-Ferrand, France, 2022.

[Souza 2022] Renan Souza, Leonardo Azevedo, Vítor Lourenço, Elton Soares, Raphael Thiago, Rafael Brandão, Daniel Civitarese, Emilio Brazil, Marcio Moreno, Patrick Valduriez, Marta Mattoso, Renato Cerqueira, Marco Netto. Workflow Provenance in the Lifecycle of Scientific Machine Learning. Concurrency and Computation: Practice and Experience, 34 (14), pp.e6544, 2022.

[Zorrilla 2022] Rocío Zorrilla, Eduardo Ogasawara, Patrick Valduriez, Fabio Porto. A Data-Driven Model Selection Approach to Spatio-Temporal Prediction. Simpósio Brasileiro de Banco de Dados (SBBD), Buzios, Brazil. 1-12, 2022.

2021

[Borges 2021a] Heraldo Borges, Reza Akbarinia, Florent Masseglia. Anomaly Detection in Time Series. Transactions on Large-Scale Data- and Knowledge-Centered Systems, Springer, 17 pages, In press, 2021.

[Borges 2021b] Heraldo Borges. Discovering Motifs Restricted in Space-Time. Ph.D. thesis, CEFET/RJ, Rio de Janeiro, Brazil, 2021. PhD advisors: Eduardo Ogasawara, Esther Pacitti.

[Castro 2021] Antonio Castro, Heraldo Borges, Ricardo Campisano, Esther Pacitti, Fabio Porto, Rafaelli Coutinho, Eduardo Ogasawara. Generalização de Mineração de Sequências Restritas no Espaço e no Tempo. Simpósio Brasileiro de Banco de Dados (SBBD), Online, Brazil. 313-318, 2021.

[Heidsieck 2021] Gaëtan Heidsieck, Daniel de Oliveira, Esther Pacitti, Christophe Pradal, Francois Tardieu, Patrick Valduriez. Cache-aware scheduling of scientific workflows in a multisite cloud. Future Generation Computer Systems (FGCS), 172-186, 2021.

[Kunstmann 2021] Liliane Kunstmann, Debora Pina, Filipe Silva, Aline Paes, Patrick Valduriez, Daniel de Oliveira, Marta Mattoso. Online Deep Learning Hyperparameter Tuning based on Provenance Analysis. Journal of Information and Data Management, 12(5):396-414, 2021.

[Lustosa 2021] Hermano Lustosa, Anderson C. Silva, Daniel N. R. da Silva, Fabio Porto, Patrick Valduriez. SAVIME: An Array DBMS for Simulation Analysis and ML Models Prediction. Journal of Information and Data Management, 11 (3):247-264, 2021.

[Pina 2021] Débora Pina, Liliane Kunstmann, Daniel de Oliveira, Patrick Valduriez, Marta Mattoso. Provenance Supporting Hyperparameter Analysis in Deep Neural Networks. International Provenance and Annotation Workshop (IPAW), 2021.

[Rosendo 2021a] Daniel Rosendo, Alexandru Costan, Gabriel Antoniu, Patrick Valduriez. Reproducible Performance Optimization of Complex Applications on the Edge-to-Cloud Continuum. IEEE International Conference on Cluster Computing (Cluster), 23-34, 2021.

[Rosendo 2021b] Daniel Rosendo, Alexandru Costan, Gabriel Antoniu, Patrick Valduriez. Enabling Reproducible Analysis of Complex Workflows on the Edge-to-Cloud Continuum. 37ème Conférence sur la Gestion de Données – Principes, Technologies et Applications (BDA), Paris, France, 2021.

[Silva 2021a] Rodrigo Silva, Esther Pacitti, Yuri Frota, Daniel de Oliveira. Análise de Desempenho da Distribuição de Workflows Científicos em Nuvens com Restrições de Confidencialidade. Workshop on Computer and Communication Systems Performance (WPerformance 2021), Online, Brazil. 12, 2021.

[Silva 2021b] Daniel Silva, Esther Pacitti, Aline Paes, Daniel de Oliveira. Provenance- and machine learning-based recommendation of parameter values in scientific workflows. PeerJ Computer Science, PeerJ, 7, pp.e606, 2021.

[Silva 2021c] Rômulo Silva, Debora Pina, Liliane Kunstmann, Daniel de Oliveira, Patrick Valduriez, Alvaro L.G.A. Coutinho, Marta Mattoso. Capturing Provenance to Improve the Model Training of PINNs: first hands-on experiences with Grid5000. Pan-American Congress on Computational Mechanics, CILAMCE-PANACM, 1-7, 2021.

[Souza 2021] Renan Souza, Vitor Silva, Alexandre Lima, Daniel de Oliveira, Patrick Valduriez, Marta Mattoso. Distributed in-memory data management for workflow executions. PeerJ Computer Science, PeerJ, 2021.

2020

[Borges 2020a] Heraldo Borges, Murillo Dutra, Amin Bazaz, Rafaelli Coutinho, Fabio Perosi, Fabio Porto, Florent Masseglia, Esther Pacitti, Eduardo Ogasawara. Spatial-time motifs discovery. Intelligent Data Analysis, 24, 1121-1140, 2020.

[Borges 2020b] Heraldo Borges, Amin Bazaz, Esther Pacitti, Eduardo Ogasawara. STMotif: Discovery of Motifs in Spatial-Time Series. https://cran.r-project.org/web/packages/STMotif, 2020.

[Lemus 2020] Noel Lemus, Fábio Porto, Yania Souto, Rafael Pereira, Ji Liu, Esther Pacitti, Patrick Valduriez. SUQ²: Uncertainty Quantification Queries over Large Spatio-temporal Simulations. IEEE Data Engineering Bulletin 43(1), 47-59, 2020.

[Liu 2020] Ji Liu, Noel Moreno Lemus, Esther Pacitti, Fábio Porto, Patrick Valduriez. Parallel Computation of PDFs on Big Spatial Data Using Spark. Distributed and Parallel Databases, Springer, 38, 63-100, 2020.

[Heidsieck 2020a] Gaëtan Heidsieck, Daniel de Oliveira, Esther Pacitti, Christophe Pradal, Francois Tardieu, Patrick Valduriez. Efficient Execution of Scientific Workflows in the Cloud Through Adaptive Caching. Transactions on Large-Scale Data- and Knowledge-Centered Systems (TLDKS), 44, 41-66, 2020.

[Heidsieck 2020b] Gaëtan Heidsieck, Daniel de Oliveira, Esther Pacitti, Christophe Pradal, Francois Tardieu, Patrick Valduriez. Distributed Caching of Scientific Workflows in Multisite Cloud. 31st International Conference on Database and Expert Systems Applications (DEXA), 51-65, 2020. Best paper award.

[Pina 2020] Débora Pina, Liliane Kunstmann, Daniel de Oliveira, Patrick Valduriez, Marta Mattoso. Uma abordagem para coleta e análise de dados de configurações em redes neurais profundas. Simpósio Brasileiro de Banco de Dados (SBBD), Virtual, Brazil. 1-6, 2020.

[Souza 2020a] Renan Souza, Vitor Silva, Alvaro L.G.A. Coutinho, Patrick Valduriez, Marta Mattoso. Data reduction in scientific workflows using provenance monitoring and user steering. Future Generation Computer Systems (FGCS), Elsevier, 110, 481-501, 2020.

[Souza 2020b] Renan Souza, Leonardo Guerreiro Azevedo, Vítor Lourenço, Elton F. S. Soares, Raphael Thiago, Rafael Brandão, Daniel Civitarese, Emilio Vital Brazil, Márcio Ferreira Moreno, Patrick Valduriez, Marta Mattoso, Renato Cerqueira, Marco A. S. Netto. Workflow Provenance in the Lifecycle of Scientific Machine Learning. CoRR abs/2010.00330, 2020.

[Silva 2020] Vítor Silva, Vinícius Campos, Thaylon Guedes, José Camata, Daniel de Oliveira, Alvaro Coutinho, Patrick Valduriez, Marta Mattoso. DfAnalyzer: Runtime dataflow analysis tool for Computational Science and Engineering applications. SoftwareX, Elsevier, 12, pp.100592, 2020.

Permanent link to this article: https://team.inria.fr/zenith/hpdasc/publications/