Our approach is to capitalize on the principles of distributed and parallel data management. In particular, we exploit: high-level languages as the basis for data independence and automatic optimization; data semantics to improve information retrieval and automate data integration; declarative languages (algebra, calculus) to manipulate data and workflows; and highly distributed and parallel environments such as P2P, cluster and cloud. To reflect our approach, we organize our research program in four complementary themes:
- data search, including machine learning, recommendation and content-based image retrieval;
- data analytics, including scientific workflows and data mining;
- data integration, including data capture and cleaning;
- distributed data management, in particular, storage, indexing and privacy.
- Data science, big data, scientific data
- Cluster, cloud, peer-to-peer
- Distributed and parallel data management, data integration, data privacy, data analytics, machine learning, data search, content-based image retrieval