PhD defense of Mohamed-Amine Baazizi

10.30, room 435, PCRI

Title: “Analyse statique pour l’optimisation des mises à jour de documents XML temporels” (Static analysis for optimizing updates of temporal XML documents)


The last decade has witnessed a rapid expansion of XML as a format for representing and exchanging data on the web. To keep pace with this evolution, many languages have been proposed to query, update or transform XML documents. At the same time, a range of systems for storing and processing XML documents has been developed. Among these systems, main-memory engines are lightweight systems that are the favoured choice for applications that do not require the complex functionalities of a traditional DBMS, such as secondary storage indexes or transaction management. These engines require the documents being processed to be loaded entirely into main memory. Consequently, they suffer from space limitations and cannot process very large documents.

In this thesis, we investigate issues related to the evolution of XML documents and to the management of the temporal dimension for XML. The thesis consists of two parts sharing the common goal of developing efficient techniques for processing large XML documents using main-memory engines. The first part investigates the optimization of updates for static XML documents. We have developed a technique based on XML projection, a method that was proposed to overcome the limitations of main-memory engines in the case of querying. We have devised a new scenario for projection that allows the effects of updates to be propagated.

The second part of the thesis investigates the issue of building and maintaining time-stamped XML documents under space limitations. Our contribution consists of two methods. The first method can be applied in the general case, where no restriction is made on the evolution of the XML documents. This method is designed to run in a streaming fashion and can thus process large time-stamped documents. The second method deals with the case where the changes are specified by updates. This method is based on the projection paradigm, which allows it to process large time-stamped documents and to generate time-stamped documents that are satisfactory in terms of storage.
