Name: Kimang KHUN
Contact: kimang.khun@polytechnique.org
About me:
Currently, I am a PhD student supervised by Nicolas GAST and Bruno GAUJAL at the University of Grenoble Alpes, working in the Polaris team of LIG and Inria. My PhD project is about using Reinforcement Learning (RL) algorithms for the Multi-Armed Bandit (MAB) problem in which each arm is a Markov Decision Process (MDP). This problem can be viewed as a single MDP whose state space is exponential in the number of arms, which is known as the curse of dimensionality. Our goal is to investigate whether RL algorithms can exploit the structure of the MAB problem to escape the curse of dimensionality during learning.
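A minimal Python sketch (illustrative only, with a hypothetical per-arm state set) of why the joint MDP blows up: with S states per arm and n arms, the joint state space has S^n states.

```python
# Illustrative sketch of the curse of dimensionality in Markovian bandits:
# the joint state of n arms is a tuple of per-arm states, so the joint
# state space has S**n elements when each arm has S states.
from itertools import product

def joint_states(arm_states, n_arms):
    """Enumerate all joint states of n_arms arms, each taking
    values in the given per-arm state set (hypothetical example)."""
    return list(product(arm_states, repeat=n_arms))

per_arm = ["s0", "s1", "s2"]  # 3 states per arm (illustrative)
for n in (1, 2, 4, 8):
    # len(...) equals 3**n: exponential growth in the number of arms
    print(n, len(joint_states(per_arm, n)))
```

This is why generic tabular RL on the joint MDP becomes intractable quickly, and why exploiting the product structure of the arms matters.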
Academic background:
10/2019 – Present : PhD student, Université Grenoble Alpes, Grenoble, France
10/2018 – 09/2019 : Master’s Degree in Artificial Intelligence and Advanced Visual Computing, Ecole Polytechnique, Palaiseau, France
09/2015 – 09/2018 : Engineer’s Degree in Data Science, Ecole Polytechnique, Palaiseau, France
Professional experiences:
01/2020 – 06/2020 and 01/2021 – 06/2021 : Teacher of the module INF204 at DLST, University of Grenoble Alpes, Grenoble, France
Teaching L1 students how to program in Python.
04/2019 – 09/2019 : Research Intern in Polaris team of Inria Grenoble Alpes, Grenoble, France
Investigating the use of Reinforcement Learning for Markovian bandits when the structure of each bandit is unknown.
03/2018 – 08/2018 : Research Intern, Réseau de Transport d’Electricité, Paris, France
Working in the Tau team of Inria Saclay, investigating the use of Reinforcement Learning to control the high-voltage network.
06/2017 – 08/2017 : Engineer Intern, Ontruck, Madrid, Spain
Implementing algorithms from the literature for the Vehicle Routing Problem with timetabled pick-up and drop-off.
Publications:
- Nicolas GAST, Bruno GAUJAL, Kimang KHUN 2022. Computing Whittle (and Gittins) index in Subcubic Time, MMOR, https://hal.inria.fr/hal-03602458/document
- Nicolas GAST, Bruno GAUJAL, Kimang KHUN 2022. Reinforcement Learning for Markovian Bandits: Is Posterior Sampling more Scalable than Optimism?, ECML, https://hal.inria.fr/hal-03262006/document