Variational meta-reinforcement learning for social robotics

by Anand Ballou, Xavier Alameda-Pineda, and Chris Reinke

Applied Intelligence

[paper] [code]

Abstract: With the increasing presence of robots in our everyday environments, improving their social skills is of utmost importance. Nonetheless, social robotics still faces many challenges. One bottleneck is that robotic behaviors often need to be adapted, as social norms depend strongly on the environment. For example, a robot should navigate more carefully around patients in a hospital than around workers in an office. In this work, we investigate meta-reinforcement learning (meta-RL) as a potential solution. Here, robot behaviors are learned via reinforcement learning, where a reward function needs to be chosen so that the robot learns an appropriate behavior for a given environment. We propose to use a variational meta-RL procedure that quickly adapts the robot's behavior to new reward functions. As a result, in a new environment, different reward functions can be quickly evaluated and an appropriate one selected. The procedure learns a vectorized representation for reward functions and a meta-policy that can be conditioned on such a representation. Given observations from a new reward function, the procedure identifies its representation and conditions the meta-policy on it. While investigating the procedure's capabilities, we realized that it suffers from posterior collapse, where only a subset of the dimensions in the representation encode useful information, resulting in reduced performance. Our second contribution, a radial basis function (RBF) layer, partially mitigates this negative effect. The RBF layer lifts the representation to a higher-dimensional space, which is easier for the meta-policy to exploit. We demonstrate the benefits of the RBF layer and the suitability of meta-RL for social robotics in four robotic simulation tasks.
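To make the RBF idea concrete, below is a minimal PyTorch sketch of such a layer, assuming Gaussian basis functions with learnable centers and bandwidths; the class name, dimensions, and parameterization are illustrative and not taken from the paper's actual code. Each output unit responds to how close the latent task embedding is to a learned center, lifting a low-dimensional representation into a higher-dimensional feature vector for the meta-policy.

```python
import torch
import torch.nn as nn


class RBFLayer(nn.Module):
    """Hypothetical Gaussian RBF layer: lifts a task embedding to a
    higher-dimensional feature space (a sketch, not the paper's code)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        # One learnable center per output unit, in the input space.
        self.centers = nn.Parameter(torch.randn(out_dim, in_dim))
        # Learnable log-bandwidths controlling the width of each Gaussian.
        self.log_scales = nn.Parameter(torch.zeros(out_dim))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, in_dim) task embedding -> (batch, out_dim) features.
        dist2 = torch.cdist(z, self.centers).pow(2)  # squared distances to centers
        return torch.exp(-dist2 * torch.exp(self.log_scales))


# Usage: lift a 5-dimensional latent task representation to 64 RBF features
# before feeding it to the meta-policy (dimensions chosen for illustration).
rbf = RBFLayer(in_dim=5, out_dim=64)
z = torch.randn(8, 5)   # batch of latent task embeddings
features = rbf(z)       # shape (8, 64)
```

One intuition for why this can help with posterior collapse: even if only a few latent dimensions carry information, the distance-based activations spread that information across many features, giving the meta-policy a richer, more localized input representation.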
