Database Design for NoSQL Systems

The paper “Database Design for NoSQL Systems” by Paolo Atzeni, Francesca Bugiotti, Luca Cabibbo and Riccardo Torlone has been accepted at ER 2014.

Permanent link to this article:

Julien Leblay: Querying the deep web: from logic to optimisation, and back

When: Tuesday, May 13, at 13.00

Where: PCRI building, room 435

Who: Julien Leblay, University of Oxford

Title: Querying the deep web: from logic to optimisation, and back

The deep web commonly refers to documents and resources that are not directly accessible through web browsers, crawlers and other types of clients. Data residing behind web forms or web services belong to this class, as one generally needs to provide input values to access them. Querying the deep web raises two difficult issues: (i) can a query be answered despite existing access restrictions, and (ii) if so, how can one find an efficient plan that automatically chooses the right inputs to produce the answer? Although a query may appear to be unanswerable, reasoning on the access restrictions and existing dependencies may uncover ways to answer it. Our approach tackles both problems concurrently by looking for proofs that a query is answerable and building a plan from each of these proofs. Plans are assigned costs, both to find an optimal one and to guide the proof search. I will present ongoing work on the topic, which has connections with data integration, view-based query answering and web service composition.

DanaC 2014: “PAXQuery: A Massively Parallel XQuery Processor”

“PAXQuery: A Massively Parallel XQuery Processor”, by Jesús Camacho-Rodríguez, Dario Colazzo and Ioana Manolescu, was accepted for publication at the “Data Analytics in the Cloud” (DanaC) workshop 2014.

Experiencing WWW 2014 in Seoul

The OAK team participated in the WWW 2014 conference through:

  • the tutorial “Entity Resolution in the Web of Data” by Kostas Stefanidis (ICS at FORTH), Vasilis Efthymiou (University of Crete), Melanie Herschel (University of Paris South), Vassilis Christophides (University of Crete)
  • the article “RDF Analytics: Lenses over Semantic Graphs” by Dario Colazzo, François Goasdoué, Ioana Manolescu and Alexandra Roatiș

As for the cultural experience, here come the highlights.

A lovely welcome reception:


Wishing the World Wide Web a happy 25th birthday:


Followed by a show featuring traditional dancing:


Enjoying the Korean barbecues:


Admiring the culture and history:


Marveling at the skyline and indulging in a bit of shopping:


And of course taking lots of photos 🙂

TAPP 2014: Immutably Answering Why-Not Questions for Equivalent Conjunctive Queries

The paper “Immutably Answering Why-Not Questions for Equivalent Conjunctive Queries” by Nicole Bidoit, Melanie Herschel, and Katerina Tzompanaki has been accepted at TAPP 2014.

Alexandra Roatis: RDF Analytics: Lenses over Semantic Graphs

When: Friday, April 4, at 14.00

Where: PCRI building, room 445

Who: Alexandra Roatis

Title: RDF Analytics: Lenses over Semantic Graphs

The development of the Semantic Web (RDF) brings new requirements for data analytics tools and methods, going beyond querying to semantics-rich analytics through warehouse-style tools. In this work, we fully redesign, from the bottom up, core data analytics concepts and tools in the context of RDF data, leading to the first complete formal framework for warehouse-style RDF analytics. Notably, we define i) analytical schemas tailored to heterogeneous, semantics-rich RDF graphs, ii) analytical queries which (beyond relational cubes) allow flexible querying of the data and the schema as well as powerful aggregation, and iii) OLAP-style operations. Experiments on a fully implemented platform demonstrate the practical interest of our approach.

Roxana Horincar: From temporal to multidimensional data: refresh strategies and search

When: Friday, March 28, at 14.00

Where: PCRI building, room 445

Who: Roxana Horincar

Title: From temporal to multidimensional data: refresh strategies and search

Because of the rapid growth of data sources, services and devices connected to the Internet, web content available online is becoming more and more diverse and dynamic, with multiple dimensions. First, data has a temporal nature: each piece of information is associated with a time instant (e.g., publication date). Furthermore, it has a social nature, being the result of a collaborative contribution of individuals or communities. It can also be localized in a geographical space (e.g., POI) or have textual descriptions attached (e.g., keywords, tags).
In the first part, I address the particular issue of large-scale aggregation of highly dynamic information sources by focusing on the design of optimal refresh strategies for large collections of RSS feed documents.
In the second part, I introduce the problem of query answering on multidimensional (i.e., social, spatial, textual and temporal) data, discussing the context, the challenges and some research directions.

Francesca Bugiotti: A model oriented approach to heterogeneity

When: Friday, March 21, at 14.00

Where: PCRI building, room 445

Who: Francesca Bugiotti

Title: A model oriented approach to heterogeneity

Data heterogeneity is a major issue in any context where software directly deals with data. The most general expectation of any complex system is the so-called seamless integration, where data can be accessed, retrieved and handled with uniform techniques, tools and algorithms.

The aim of this work is to address data heterogeneity and data integration techniques from a number of perspectives.

From the theoretical perspective, the core problem of heterogeneity is that data can be intrinsically different because multiple data models are adopted to organize them. Here, model management is considered as the framework to formalize model and data translation problems: a schema, an instance of a certain model, is translated into another schema, an instance of a target model, relying on a model-independent approach based on a general meta-level.

From the performance perspective, translations cannot be performed outside the involved systems with an import-translate-export process. The schema and data translation approach has therefore been extended in order to perform runtime translations and automatically generate views of data.

As an application example, a model-independent solution to the round-trip engineering problem is illustrated, showing the typical propagation of changes among related schemas.

Nowadays, the market demand for highly specialized data processors, which perform best in specific cases such as web content retrieval, document search, object serialization and parallel computation, receives particular attention. NoSQL engines promise exceptional performance in non-transactional settings and leverage simplified but peculiar data models. Therefore, a core goal of data integration is to provide techniques and tools that facilitate the interaction with these systems from both a theoretical and a technical perspective. A new interface has been defined with the goal of supporting applications by hiding the heterogeneity of the languages and the interfaces of the various NoSQL systems.

Alexandra’s profile on the INRIA web site

Following several exchanges with Alexandra, the following short interview was published:

CityLab INRIA Lab accepted

The CityLab INRIA Project Lab has been accepted for funding by INRIA for a period of 4 years. The project is coordinated by Valérie Issarny (INRIA@Silicon Valley & Arles-Mimove). The other INRIA project-teams involved are Clime, Dice, Fun, Myriads, OAK, SMIS, Urbanet and Willow. The project will study ICT solutions for smart cities that promote both social and environmental sustainability.
