October 14, 2021
Abstract: The exploration-exploitation dilemma is the following: should you explore new solutions (and risk losing time or money), or should you exploit existing ones (and risk missing an opportunity)? This talk will serve as a basic introduction to these concepts and their use in reinforcement learning algorithms. We will talk about regret, bandit problems, and Monte Carlo tree search (MCTS).
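As a minimal illustration of the dilemma (not part of the talk itself), here is a sketch of the classic epsilon-greedy strategy on a Bernoulli multi-armed bandit: with probability epsilon the agent explores a random arm, and otherwise it exploits the arm with the best empirical mean. The function name and parameters are illustrative choices, not from the source.

```python
import random

def epsilon_greedy_bandit(arm_means, epsilon=0.1, steps=1000, seed=0):
    """Simulate an epsilon-greedy agent on a Bernoulli multi-armed bandit.

    With probability `epsilon` the agent explores (pulls a random arm);
    otherwise it exploits the arm with the highest empirical mean reward.
    Returns the total reward and the number of pulls per arm.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms      # pulls per arm
    values = [0.0] * n_arms    # empirical mean reward per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                        # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]    # incremental mean
        total += reward
    return total, counts

total, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

Over enough steps, the agent concentrates its pulls on the best arm while still spending an epsilon fraction of pulls exploring; the regret mentioned in the abstract measures exactly the reward lost relative to always pulling the best arm.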