Internship: Semantic Representations for Interpretable Reinforcement Learning

Supervision: Riad Akrour (riad -dot- akrour -at- inria -dot- fr)

When: Spring and Summer 2022


Automating decision making is an appealing prospect of AI. However, the solution produced by an AI system may need to be inspected by a human expert before its deployment in the real world. To this end, the AI needs to return a policy in a format that is semantically meaningful to the human. In addition, the complexity of the policy needs to be controlled to account for the bounded capacities of humans. In the machine learning and reinforcement learning literature, there is a growing body of research on learning disentangled and semantically meaningful state representations, and on learning compact, human-readable policies.

Within this context, we have two main goals to advance the state of the art in interpretable RL. These goals can be tailored to the background and preferences of the intern. In all cases, we expect the applicant to have a good knowledge of RL. If the applicant additionally has an affinity for computer vision, they might focus on the state representation part: learning representations that are semantically meaningful to humans, over which interpretable policies are defined. A starting point would be the object-centric decomposition models used in [4], or the similar model used in [3] for interpretable imitation learning. The target application in this case would be Atari games. If the applicant prefers to focus solely on RL, they could work on hierarchical RL to learn meaningful abstract action representations over which interpretable policies are defined. The abstract actions are expressed in terms of autonomously discovered sub-goals, for instance similar to [2]. The target application in this case would be the Rubik's cube, for which interpretable policies are known and are expressed compactly thanks to an abstract representation of the action space.

Overall, one of the main challenges of interpretable RL is the omnipresence of discrete optimization routines. The end goal for both topics is to learn better representations, whether for the state or the action space, that make this discrete optimization tractable on tasks out of reach of current interpretable RL methods. An algorithmic basis that these representations would extend could be our prior work on interpretable RL [1], although we remain open to other interpretable policy representations such as decision trees. Collaboration with the IAS lab at TU Darmstadt, Germany is expected during the internship. For more information on the topic, please feel free to contact us or to check out our thesis topic on Scool's website.
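To make the "discrete optimization" point concrete, here is a minimal sketch (not from the works cited above; the toy data and the expert rule are purely illustrative) of fitting the simplest interpretable policy, a single decision stump, by exhaustively enumerating split candidates. Tree-structured policies generalize this search, and it is precisely this combinatorial enumeration that better state/action representations should keep tractable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for an expert policy on a 1-D "pole angle" state:
# take action 1 when the angle is positive, action 0 otherwise.
states = rng.uniform(-1.0, 1.0, size=(200, 1))
actions = (states[:, 0] > 0.0).astype(int)

def best_stump(X, y):
    """Exhaustively search (feature, threshold) pairs: a tiny instance of
    the discrete optimization underlying tree-based interpretable policies."""
    best = (0, 0.0, -1.0)  # (feature index, threshold, accuracy)
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f]):
            # A stump predicts a constant action on each side of the split;
            # take the better of the two possible labelings.
            acc = max(((X[:, f] > t) == y).mean(),
                      ((X[:, f] <= t) == y).mean())
            if acc > best[2]:
                best = (f, t, acc)
    return best

feature, threshold, acc = best_stump(states, actions)
print(f"split on feature {feature} at threshold {threshold:.3f}, accuracy {acc:.2f}")
```

On this toy data the search recovers the expert's threshold near zero. The cost of the enumeration grows quickly with depth and feature count, which is why representation quality matters so much for scaling such methods.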


We list below some goals that would advance the state of interpretable RL. Achieving even a single one of these goals during the internship would be a notable outcome; achieving more would be outstanding.

  • Scale interpretable RL algorithms to some Atari games using object-centric image representations.
  • Empirically demonstrate the (supposed) advantage of interpretable RL over interpretable imitation learning, by returning a strategy adapted to any desired policy complexity given as input.
  • Phrase an intrinsically motivated objective and derive its associated learning algorithm to discover sub-policies matching known skills required for solving the Rubik’s cube.
  • Learn an interpretable policy for the Rubik’s cube matching known human readable solutions.


  1. R. Akrour, D. Tateo, and J. Peters. "Continuous Action Reinforcement Learning from a Mixture of Interpretable Experts". In: IEEE Transactions on Pattern Analysis and Machine Intelligence (2021).
  2. Cédric Colas, Pierre Fournier, Olivier Sigaud, Mohamed Chetouani, and Pierre-Yves Oudeyer. "CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning". In: International Conference on Machine Learning (ICML). 2019.
  3. Guiliang Liu, Xiangyu Sun, Oliver Schulte, and Pascal Poupart. "Learning Tree Interpretation from Object Representation for Deep Reinforcement Learning". In: Conference on Neural Information Processing Systems (NeurIPS). 2021.
  4. Tom Monnier, Elliot Vincent, Jean Ponce, and Mathieu Aubry. "Unsupervised Layered Image Decomposition into Object Prototypes". In: International Conference on Computer Vision (ICCV). 2021.