Check out the short article “Making the Right Move to Senior Researcher”, to appear in the May 2021 issue of ACM SIGMOD Record in a new series, managed by Professor Tamer Özsu, that offers advice to mid-career researchers.
“Making the Right Move to Senior Researcher”, by P. Valduriez, May 2021.
May 31, 2021
Permanent link to this article: https://team.inria.fr/zenith/check-out-this-short-article-in-acm-sigmod-record-2021/
SIGKDD 2021: paper by Reza Akbarinia et al. accepted (research track).
May 17, 2021
The paper proposes PBA (Parallel Boundary Aggregator), a novel algorithm that computes incremental aggregations in parallel over massive data streams. The work was carried out with Univ. Clermont-Auvergne (postdoc Chao Zhang and Professor Farouk Toumani).
Chao Zhang, Reza Akbarinia, Farouk Toumani. Efficient Incremental Computation of Aggregations over Sliding Windows. Int ACM Conference on Knowledge Discovery and Data Mining (SIGKDD), 2021.
Nowadays, large volumes of continuous or real-time data are produced in many application domains, such as traffic monitoring, medical monitoring, social networks, weather forecasting, and network monitoring. For example, every day, around one trillion messages are processed through Uber's data analytics infrastructure and more than 500 million tweets are posted on Twitter. Efficient streaming algorithms are needed to analyze data streams in such applications. Aggregations, which inherently summarize information from data, are a fundamental operator for computing real-time statistics in this context. In the streaming setting, aggregations are typically computed over finite subsets of a stream, called windows. In particular, sliding-window aggregation (SWAG) continuously computes a summary of the most recent data items in a given range r (aka window size), advancing by a given slide s.
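To make the roles of the range r and the slide s concrete, here is a minimal Python sketch of a sliding-window aggregation that recomputes each window from scratch. It is only an illustration of the definition above, not code from the paper: the function name `naive_swag`, the count-based windows, and the choice to emit only complete windows are assumptions made for this example.

```python
from collections import deque
from functools import reduce
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def naive_swag(items: Iterable[T], r: int, s: int,
               combine: Callable[[T, T], T]) -> List[T]:
    """Aggregate the r most recent items after every s arrivals.

    Hypothetical helper for illustration: windows are count-based and
    only complete windows (exactly r items) produce a result.
    """
    window = deque(maxlen=r)  # the oldest item is evicted automatically
    results = []
    for i, x in enumerate(items, start=1):
        window.append(x)
        # First result after r items, then one result every s items.
        if i >= r and (i - r) % s == 0:
            # Recomputed from scratch: O(r) work per result.
            results.append(reduce(combine, window))
    return results


if __name__ == "__main__":
    print(naive_swag([1, 2, 3, 4, 5, 6], r=3, s=1, combine=max))
```

Each result here costs O(r) combine operations, which is the recomputation that incremental SWAG algorithms aim to avoid.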
One of the challenges faced by the SWAG algorithms is to incrementally compute aggregations over moving data, i.e., without recomputing the aggregation from scratch after inserting new data items or evicting old data items to/from the window. High throughput and low latency are essential requirements as stream processing systems are typically designed for real-time applications.
In this paper, we propose PBA (Parallel Boundary Aggregator), a novel algorithm that computes incremental aggregations in parallel. PBA groups continuous slices into chunks, and maintains two buffers for each chunk containing, respectively, the cumulative slice aggregations (denoted as csa) and the left cumulative slice aggregations (denoted as lcs) of the chunk’s slices. Using PBA, SWAGs can be computed in constant time, both amortized and worst-case. We also propose an approach to optimize the chunk size, which guarantees the minimum latency for PBA. We conducted extensive experiments using both synthetic and real-world datasets. They show that PBA performs very well for medium and large sliding windows (e.g., with sizes larger than 1024 values) compared to state-of-the-art algorithms. For small windows, the results show the superiority of the non-parallel version of PBA (denoted as SBA), which outperforms the other algorithms in terms of throughput.
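The chunk idea can be illustrated with a much simpler, sequential relative of the approach: the classic prefix/suffix technique for sliding aggregation. The sketch below is not PBA and not the authors' implementation. It assumes a slide of 1, a chunk size equal to the window size r, a finite list rather than a live stream, and an associative `combine` function. The `prefix` and `suffix` arrays only loosely mirror the two per-chunk buffers described above; their exact correspondence to csa and lcs is an assumption, and all names are hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def chunked_swag(items: Sequence[T], r: int,
                 combine: Callable[[T, T], T]) -> List[T]:
    """Sliding aggregation (slide 1) from per-chunk prefix/suffix aggregates.

    `combine` must be associative; operand order is preserved, so it
    does not need to be commutative.
    """
    n = len(items)
    if r <= 0 or n < r:
        return []
    # prefix[i]: aggregate from the start of i's chunk up to and including i.
    prefix = list(items)
    for i in range(n):
        if i % r != 0:
            prefix[i] = combine(prefix[i - 1], items[i])
    # suffix[i]: aggregate from i up to the end of i's chunk.
    suffix = list(items)
    for i in range(n - 1, -1, -1):
        is_chunk_end = (i % r == r - 1) or (i == n - 1)
        if not is_chunk_end:
            suffix[i] = combine(items[i], suffix[i + 1])
    results = []
    for start in range(n - r + 1):
        end = start + r - 1
        if start % r == 0:
            results.append(prefix[end])  # window coincides with one chunk
        else:
            # Window spans a chunk boundary: tail of the left chunk
            # combined with the head of the right chunk.
            results.append(combine(suffix[start], prefix[end]))
    return results


if __name__ == "__main__":
    print(chunked_swag([3, 1, 4, 1, 5, 9, 2, 6], r=3, combine=max))
```

In this simplified setting every window result needs at most one `combine` call once the two arrays are built, which gives an intuition for why boundary-based schemes can reach constant time per window. The parallelism and the chunk-size optimization described in the abstract are not modeled here.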
Permanent link to this article: https://team.inria.fr/zenith/sigkdd-2021-paper-by-reza-akbarinia-et-al-accepted-research-track/
ICML 2021: paper by Antoine Liutkus et al. accepted (as long presentation).
May 10, 2021
Permanent link to this article: https://team.inria.fr/zenith/icml-2021-paper-by-antoine-liutkus-et-al-accepted-as-long-presentation/
Seminar by Patrick Valduriez, “Innovation: startup strategies”, 20 May 2021.
May 10, 2021
The 12th edition of the Marcus Evans “Innovation Strategies” conference will be held virtually from May 19 to 20, 2021.
https://drive.google.com/file/d/1HQ8AHzcPbK3eqap8hqGT43TGcHHYZSEt/preview
Check out my presentation on May 20 at 14h40:
Innovation: startup strategies
Patrick Valduriez
Inria and LIRMM, Univ. Montpellier, France
Technological innovation as driven by startups is hard to formalize (and manage) as the context may be unknown or quickly changing. To be successful, the innovation process involves not only inventions (new methods) but also context, e.g. user behavior, and timing, e.g. market readiness. In this talk, I illustrate various innovation strategies based on startup success stories, in particular LeanXcale, which delivers a new generation HTAP DBMS product. I also give hints to promote innovation within startups.
Permanent link to this article: https://team.inria.fr/zenith/2597-2/
Permanent link to this article: https://team.inria.fr/zenith/new-video-with-gaetan-heidsieck-esther-pacitti-and-francois-tardieu-the-design-of-digital-agriculture-may-2021/
Esther Pacitti at Online Seminar of Instituto de Computação – UFF on 17 March 2021.
March 18, 2021
Esther Pacitti gave an invited lecture at Seminários 2021, Instituto de Computação – UFF, Rio de Janeiro, on “Uma Perspectiva Evolutiva e Multidisciplinar do Tratamento de Dados” (“An Evolutionary and Multidisciplinary Perspective on Data Processing”).
This work is in the context of the HPDaSc Inria associated team.
Watch the video on YouTube.
Permanent link to this article: https://team.inria.fr/zenith/esther-pacitti-at-online-seminar-at-instituto-de-computacao-uff-on-17-march-2021/
Patrick Valduriez at (Online) University of Paris Seminar Series on Data Analytics, on 25 February 2021.
February 22, 2021
Patrick Valduriez at (Online) University of Paris Seminar Series on Data Analytics, in collaboration with the diNo group, on 25 February 2021, 15h.
http://helios.mi.parisdescartes.fr/~themisp/seminars/2021-02-25-Valduriez.html
Distributed Database Systems: the case for NewSQL
NewSQL [Valduriez & Jimenez-Peris 2019] is the latest technology in the big data management landscape and is growing fast in the DBMS and BI markets. NewSQL combines the scalability and availability of NoSQL with the consistency and usability of SQL. By blending capabilities previously available only in different kinds of database systems, such as fast data ingestion and SQL queries, and by providing online analytics over operational data, NewSQL opens up new opportunities in many application domains where real-time decision making is critical. Important use cases include e-advertising (such as Google AdWords), IoT, performance monitoring, proximity marketing, risk monitoring, real-time pricing, and real-time fraud detection. NewSQL may also simplify data management by removing the traditional separation between NoSQL and SQL (ingest data fast, query it with SQL), as well as between operational database and data warehouse / data lake (no more ETLs!). However, a hard problem is scaling out transactions in mixed operational and analytical (HTAP) workloads over big data, possibly coming from different data stores (HDFS, SQL, NoSQL). Today, only a few NewSQL systems have solved this problem. In this talk, I introduce the solution for scalable transaction and polystore data management in LeanXcale, a recent NewSQL DBMS.
Permanent link to this article: https://team.inria.fr/zenith/patrick-valduriez-at-online-university-of-paris-seminar-series-on-data-analytics-on-25-february-2021-15h/
Inria – Académie des Sciences Innovation Prize 2020 awarded to Pl@ntnet
November 25, 2020
Permanent link to this article: https://team.inria.fr/zenith/prix-plantnet-2020/
Inria Brasil – the web site, 24 November 2020
November 24, 2020
The Inria Brasil web site is now open.
It reflects the collaboration between Inria and LNCC, the Brazilian National Scientific Computing Laboratory, and associated Brazilian universities in High Performance Computing, Artificial Intelligence, Data Science and Scientific Computing. The collaboration is headed by Frédéric Valentin (LNCC, Inria International Chair) and Patrick Valduriez.
Permanent link to this article: https://team.inria.fr/zenith/inria-brasil-the-web-site/
Patrick Valduriez at (Online) CWI lectures on Database Research, 19 Nov. 2020.
November 3, 2020
Patrick Valduriez on “Distributed Database Systems: the case for NewSQL” on 19 Nov. 2020 at (Online) CWI lectures on Database Research.
Permanent link to this article: https://team.inria.fr/zenith/online-cwi-lectures-on-database-research/
Events
- Patrick Valduriez, keynote speaker at SSDBM 2024.
- HPDaSc Workshop on Data Driven Science, 31 May, 2024, Campus Saint Priest, Bat. 5, Montpellier.
- Seminar by Patrick Valduriez on “Ciência de Dados e Inovação”, IMPA, Rio de Janeiro, 9 May 2024
- Spotlight on PlantNet, 28 February 2024
- Seminar by Patrick Valduriez on “Big Data Technologies”, Inria Paris, 23 November 2023