Our approach is to capitalize on the principles of distributed and parallel data management. In particular, we exploit: high-level languages as the basis for data independence and automatic optimization; declarative languages to manipulate data and workflows; and highly distributed and parallel environments such as clusters and clouds for scalability and performance. We also exploit machine learning, probability and statistics for high-dimensional data processing, data analytics and data search.
- Distributed data management, including data integration and scientific workflows
- Big data management and parallel data management
- Data analytics, including data mining and statistics
- Machine learning for high-dimensional data processing
Keywords
- Data science, big data, scientific data
- Cluster, cloud, peer-to-peer
- Distributed and parallel data management, data integration, data privacy, data analytics, machine learning, data searching, content-based image retrieval