Nowadays there are many scenarios where autonomous robots meet humans or other robots, and where safe supervision of machines with different levels of collaboration is required. Self-driving cars on the streets, mobile robots in modern automated factories, or platoons of flying vehicles give a general picture of such scenarios. All of them share common characteristics: the presence of humans (or human-controlled agents) alongside autonomous robots; uncertainty in the models describing the other participants and in the goals they pursue; highly varying dynamic environments; and the need for harmless yet precise coordination of all mobile robots. The goal of this work is to design algorithms for robot navigation under vastly uncertain conditions, in the presence of other robots and humans, which penalize both the risk of an error and its cost. The complexity of the posed problem requires interdisciplinary approaches for its solution; accordingly, this project lies at the intersection of the machine learning and automatic control domains. It is intended to combine reinforcement learning and model predictive control approaches to realize path-planning algorithms.
Supervisors (contact persons)
Regulation of dynamical systems is a mature area of research, with many methods that have demonstrated successful applications in different domains. These methods share a common feature: they are oriented toward a complete analytic analysis of the controlled system's performance, obtained by fixing admissible conditions on the plant's model properties and on the environment. Due to these strict requirements, conventional control theory often focuses on stabilizing a point or a trajectory under a sufficient level of confidence in the background assumptions. However, if the amount of uncertainty in the system dynamics and the surroundings is high, while a more sophisticated regulation task must be solved, then the existing approaches cannot be applied, and new ideas and tools coming from the domain of artificial intelligence (AI) become popular [1-5]. As examples of such situations, consider self-driving cars on streets or highways, mobile robots in modern automated factories, or platoons of flying vehicles. All these cases are characterized by humans and autonomous robots sharing a common operating space. In addition, there is uncertainty in the models describing the other participants, while the objectives they pursue are also unknown. These characteristics form a complex and varying environment, in which safe and accurate coordination of mobile robots must be guaranteed amid dynamically evolving obstacles. It is worth highlighting that the robots may collaborate or synchronize their maneuvers (e.g., to form a formation or to safely pass a crossroad), and we can group them by the kind of admissible or achievable intercommunication.
Therefore, the main aim of this PhD project is to design algorithms for autonomous robot navigation in complex and uncertain conditions, in the presence of other robots and humans. The presence of the latter increases the risk of an error while raising its probable cost. The difficulty of the posed problem requires interdisciplinary approaches for its solution; accordingly, this project lies at the intersection of the machine learning and automatic control domains, extending [6-8].
In this thesis it is planned to focus on the problem of trajectory planning for a mobile robot in the scenarios described above. First, admissible models of the other agents (humans, robots, cars) must be found for the different cases, together with methods for their robust estimation and identification. Second, robust predictors and observers have to be designed for the localization of the other participants. Third, the path-planning algorithms have to be synthesized (the principal objective), which must navigate a robot in the selected scenarios. The path decision process has to take into account the uncertainty of the environment and minimize the risk of a collision (primarily with humans), possibly by accounting for the rational behavior of other agents. Communication strategies that can help the autonomous robots synchronize and collaborate should also be analyzed.
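To fix ideas, the role of robust prediction in the path decision process can be sketched as follows. This is an illustrative toy example, not the project's method: the other agent's dynamics, velocity bounds, and safety margin are all hypothetical, and the interval predictor is reduced to propagating an axis-aligned box under bounded velocity uncertainty.

```python
# Illustrative sketch (hypothetical model): predict an interval box for
# another agent's future position under bounded velocity uncertainty,
# then reject candidate waypoints that could lie inside any box.

def predict_interval(pos, v_lo, v_hi, dt, steps):
    """Propagate an axis-aligned box [lo, hi] for the agent's future
    position, assuming its velocity stays within [v_lo, v_hi]."""
    lo, hi = list(pos), list(pos)
    boxes = []
    for _ in range(steps):
        lo = [l + v * dt for l, v in zip(lo, v_lo)]
        hi = [h + v * dt for h, v in zip(hi, v_hi)]
        boxes.append((tuple(lo), tuple(hi)))
    return boxes

def waypoint_is_safe(waypoint, boxes, margin):
    """Keep a waypoint only if it stays outside every predicted box,
    inflated by a safety margin (worst-case, hence conservative)."""
    for lo, hi in boxes:
        if all(lo[i] - margin <= waypoint[i] <= hi[i] + margin
               for i in range(len(waypoint))):
            return False
    return True

# Agent at the origin moving roughly along +x with uncertain velocity.
boxes = predict_interval((0.0, 0.0), (0.5, -0.1), (1.0, 0.1), dt=0.5, steps=4)
print(waypoint_is_safe((2.0, 0.0), boxes, margin=0.2))  # inside a worst-case box
print(waypoint_is_safe((0.0, 3.0), boxes, margin=0.2))  # well clear of all boxes
```

Because the box covers the worst case, such a predictor trades precision for guaranteed safety; the interval estimators/predictors considered in the project aim to tighten these enclosures while keeping the guarantee.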
Different approaches to reinforcement learning have to be investigated and applied, together with model predictive control tools, robust and interval estimators/predictors, and robust identification methods. For this project there is a platform with several mobile robots and quadrotors at Inria, which can be used for experiments and validation.
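As a minimal illustration of the reinforcement-learning side, consider tabular Q-learning for navigation on a small grid where a collision cell carries a large penalty, so the learned policy trades path length against collision risk. The grid, rewards, and hyperparameters below are all hypothetical; the project itself targets far richer settings (continuous dynamics, deep RL, MPC).

```python
# Toy sketch (hypothetical setup): risk-penalized navigation on a 4x4
# grid via epsilon-greedy tabular Q-learning. Hitting the obstacle cell
# ends the episode with a large negative reward.
import random

random.seed(0)
GRID = 4                       # states (x, y), start (0,0), goal (3,3)
OBSTACLE, GOAL = (1, 1), (3, 3)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def step(state, a):
    nx = min(max(state[0] + a[0], 0), GRID - 1)
    ny = min(max(state[1] + a[1], 0), GRID - 1)
    s2 = (nx, ny)
    if s2 == OBSTACLE:
        return s2, -10.0, True   # collision: large penalty, episode ends
    if s2 == GOAL:
        return s2, 10.0, True
    return s2, -0.1, False       # small step cost favors short paths

Q = {((x, y), a): 0.0
     for x in range(GRID) for y in range(GRID) for a in ACTIONS}

for _ in range(3000):            # epsilon-greedy Q-learning
    s = (0, 0)
    for _ in range(100):         # cap the episode length
        a = random.choice(ACTIONS) if random.random() < 0.1 else \
            max(ACTIONS, key=lambda b: Q[(s, b)])
        s2, r, done = step(s, a)
        target = r if done else r + 0.95 * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += 0.1 * (target - Q[(s, a)])
        s = s2
        if done:
            break

# Greedy rollout from the start: the learned policy avoids the obstacle.
s, path, done = (0, 0), [(0, 0)], False
while not done and len(path) < 20:
    a = max(ACTIONS, key=lambda b: Q[(s, b)])
    s, _, done = step(s, a)
    path.append(s)
print(path)
```

The -10 collision reward versus the -0.1 step cost is the simplest form of the risk/cost trade-off mentioned above; in the project this trade-off would instead be encoded through robust constraints and predictive models rather than a single scalar penalty.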
- Bellemare, M.G., Candido, S., Castro, P.S. et al. Autonomous navigation of stratospheric balloons using reinforcement learning. Nature 588, 77–82, 2020.
- Mnih, V., Kavukcuoglu, K., Silver, D. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533, 2015.
- Lillicrap, T., Hunt, J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971, 2015.
- Kiran, B.R., Sobh, I., Talpaert, V., Mannion, P., Sallab, A.A.A., Yogamani, S., Pérez, P. Deep reinforcement learning for autonomous driving: A survey, 2020.
- Nirmala, G., Geetha, S., Selvakumar, S. Mobile Robot Localization and Navigation in Artificial Intelligence: Survey. Computational Methods in Social Sciences 4(2), 12–22, 2016.
- Leurent, E., Maillard, O.-A., Efimov, D. Robust-Adaptive Control of Linear Systems: beyond Quadratic Costs. NeurIPS, Vancouver, 2020.
- Leurent, E., Efimov, D., Maillard, O.-A. Robust-Adaptive Interval Predictive Control for Linear Uncertain Systems. 59th IEEE CDC, Republic of Korea, 2020.
- Leurent, E., Blanco, Y., Efimov, D., Maillard, O.-A. Approximate Robust Control of Uncertain Dynamical Systems. 32nd NeurIPS, Montréal, 2018.
The candidate should be familiar with basic methods in AI and in control theory (Lyapunov stability, observer design, model predictive control), as well as in reinforcement learning (Markov decision processes, bandits).