This project takes place in the general context of information extraction from massive satellite data using advanced machine learning tools. The continuous proliferation and improvement of satellite sensors yields a huge volume of Earth imagery with high spatial, spectral and temporal resolution (up to 50 cm/pixel, 50 bands, twice per day, covering the full planet!). These data open the door to a wide range of important applications, such as the monitoring of natural disasters, the planning of urban environments and precision agriculture. However, petabytes of these massive images are stored as raw, unstructured binary data, a large part of which is never used.
The goal of this PhD thesis is to devise a novel, effective summary representation for satellite images, which would help to structure large-scale data. Such a representation must be highly generic, so as to be applicable to images from all over the world and suitable for a wide range of applications. At the same time, it has to faithfully represent the meaningful objects in the image scenes, with the possibility of being enriched with semantic or other kinds of information.
Our primary goal is to design a representation which accurately conveys the information in the image with as few primitives as possible, i.e. we seek a compact yet faithful representation with a good complexity/fidelity tradeoff.
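As a toy illustration of such a tradeoff, classic polyline simplification (e.g. the Douglas-Peucker algorithm) trades the number of vertices (complexity) against the maximum deviation from the original contour (fidelity). The sketch below is purely illustrative and not part of the proposed method:

```python
import math

def _point_segment_dist(p, a, b):
    # Perpendicular distance from point p to the segment (a, b).
    ax, ay = a; bx, by = b; px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, epsilon):
    """Simplify a polyline: fewer vertices, while no original vertex
    deviates from the simplified curve by more than epsilon."""
    if len(points) < 3:
        return list(points)
    # Find the vertex farthest from the chord joining the endpoints.
    dmax, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _point_segment_dist(points[i], points[0], points[-1])
        if d > dmax:
            dmax, idx = d, i
    if dmax <= epsilon:
        # The chord alone is a faithful enough summary of this stretch.
        return [points[0], points[-1]]
    # Otherwise keep the farthest vertex and recurse on both halves.
    left = douglas_peucker(points[:idx + 1], epsilon)
    right = douglas_peucker(points[idx:], epsilon)
    return left[:-1] + right
```

For instance, a slightly noisy L-shaped contour of six vertices collapses to its three true corners once epsilon exceeds the noise amplitude; increasing epsilon further would discard the corner itself, making the complexity/fidelity tradeoff explicit.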
Vector-based representations are well known to provide several important advantages over raster images, the three most salient being compactness, scalability and ease of updating. We thus plan to devise a multi-resolution vector-based representation, together with the algorithms required for its efficient generation and manipulation.
To preserve geometric structures and their accurate scale alignment, we will investigate the use of large sources of free-access maps, such as OpenStreetMap, together with advanced deep learning architectures, to learn the geometric structures and their relations in the image scene. The OpenStreetMap collaborative database provides large amounts of maps over the Earth, delineating (with a contour) or pointing out (with a single spatial point) objects of numerous semantic categories, such as roads, buildings, residential areas, parking lots and trees. These data will serve as a training dataset for convolutional neural network-based semantic segmentation to detect changes with respect to the existing maps, such as new or disappeared roads or buildings. The updated maps will then be used to infer both the structure and the semantics of the objects in the image (naturally, those present in the maps).
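One practical ingredient of such a pipeline is converting vector annotations (e.g. OpenStreetMap building footprints) into per-pixel label masks that a segmentation network can be trained against. A minimal sketch, assuming polygons already projected into pixel coordinates; the class ids, grid size and function names are illustrative assumptions, not part of the proposal:

```python
def point_in_polygon(x, y, polygon):
    # Ray-casting test: a point is inside iff a horizontal ray from it
    # crosses the polygon boundary an odd number of times.
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize_labels(polygons_by_class, width, height, background=0):
    """Burn vector polygons into a per-pixel label mask (row-major list of rows).
    Later classes overwrite earlier ones where polygons overlap."""
    mask = [[background] * width for _ in range(height)]
    for class_id, polygons in polygons_by_class.items():
        for poly in polygons:
            for row in range(height):
                for col in range(width):
                    # Sample class membership at the pixel centre.
                    if point_in_polygon(col + 0.5, row + 0.5, poly):
                        mask[row][col] = class_id
    return mask
```

In practice one would use an optimized rasterizer (e.g. from a GIS library) over map tiles aligned with the satellite image, but the principle is the same: the free vector annotations become dense supervision for the segmentation network.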
More details on this PhD thesis position are available here.