This week, our team had the pleasure of attending Ghufran’s PhD defense on his thesis, “Scalable Analytics on Multi-Streams Dynamic Graphs.”
Those who know Ghufran also know he’s not one to draw attention to himself. But the clarity and depth of his defense spoke volumes. It was a fitting conclusion to years of thoughtful, rigorous work, and we’re proud to have witnessed it.
Congratulations, Ghufran! We’re excited to see where your path will take you, and we’re confident that the same thoughtful approach that carried you through your PhD will shape whatever comes next.
Thesis Supervisors
Ioana Manolescu, Senior Researcher, Inria Saclay & École Polytechnique (co-supervisor)
Angelos-Christos Anadiotis, Consulting Member of Technical Staff, Oracle (co-supervisor)
Defense Jury
Professor Angela Bonifati, Lyon 1 University, France
Professor Dario Colazzo, Université Paris-Dauphine, France (reviewer)
Associate Professor Stefania Dumbrava, ENSIIE – Télécom SudParis, France
Professor George Fletcher, Eindhoven University of Technology, Netherlands (reviewer)
Professor Silviu Maniu, Université Grenoble Alpes, France
Professor Yannis Velegrakis, Utrecht University, Netherlands
Thesis Abstract
Many real-time applications, such as financial platforms, social networks, and transportation systems, rely on dynamic graphs built from multiple data streams that are highly dynamic and may arrive out of order. Supporting scalable ingestion with concurrent analytics, ensuring accuracy under out-of-order (ooo) updates, enabling efficient queries on present and historical snapshots, and handling both structural and property updates are major challenges. This thesis introduces HAL, a novel in-memory dynamic graph database engine that addresses these issues by processing updates from multiple streams while supporting concurrent and consistent analytics. At its core, HAL employs the Stream-Time Ordered Adjacency List (STAL), a data structure that maintains updates in stream-time order regardless of their arrival order, combined with a lightweight graph-aware multi-version concurrency control protocol for consistency. Experiments show that HAL achieves ingestion throughput of more than 7.5 million updates per second (up to 73× faster than existing systems) and delivers up to 357× speedups in analytical queries even under ooo updates. Beyond the core system, HAL supports historical queries that reconstruct graphs at any past stream-time moment, and property-rich graphs by allowing arbitrary properties on nodes and edges.
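To make the idea of keeping updates in stream-time order more concrete, here is a minimal Python sketch. It is not HAL's implementation and does not reproduce STAL's actual layout or its concurrency control protocol. The class name StreamTimeAdjacency and the methods apply and neighbors_at are hypothetical, chosen only for illustration. The sketch assumes each update carries a stream timestamp, stores each node's updates sorted by that timestamp regardless of arrival order, and answers a historical query by replaying updates up to the requested stream time.

```python
import bisect
from collections import defaultdict


class StreamTimeAdjacency:
    """Illustrative only: per-node edge updates kept sorted by stream time."""

    def __init__(self):
        # src -> sorted list of (stream_time, arrival_seq, dst, is_insert)
        self._log = defaultdict(list)
        self._seq = 0  # arrival counter, breaks ties between equal stream times

    def apply(self, stream_time, src, dst, insert=True):
        """Record an edge insertion or deletion, even if it arrives late."""
        self._seq += 1
        # insort places the update at its stream-time position,
        # so arrival order does not affect the stored order.
        bisect.insort(self._log[src], (stream_time, self._seq, dst, insert))

    def neighbors_at(self, src, stream_time):
        """Rebuild the neighbor set of src as of the given stream time."""
        live = set()
        for t, _, dst, is_insert in self._log.get(src, []):
            if t > stream_time:
                break  # later updates are not visible in this snapshot
            if is_insert:
                live.add(dst)
            else:
                live.discard(dst)
        return live


g = StreamTimeAdjacency()
g.apply(10, "a", "b")                # edge a->b inserted at stream time 10
g.apply(30, "a", "b", insert=False)  # edge a->b deleted at stream time 30
g.apply(20, "a", "c")                # late arrival: stream time 20, received last
print(g.neighbors_at("a", 25))       # snapshot at time 25 includes both b and c
```

This single-threaded sketch replays a node's whole update list for every query. A real engine would need a much more compact representation and a versioning scheme to serve concurrent readers, which is the part the thesis addresses.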