- Ronan Fruit, Alessandro Lazaric. “Exploration-Exploitation in MDPs with Options”. European Workshop on Reinforcement Learning, 2016.
- A. Erraqabi, A. Lazaric, M. Valko, E. Brunskill, Y-E. Liu. “Trading off Rewards and Errors in Multi-armed Bandit”. Poster presentation at the “Challenges in Machine Learning: Gaming and Education” Workshop.
- A. Erraqabi, A. Lazaric, M. Valko, E. Brunskill, Y-E. Liu. “Trading off Rewards and Errors in Multi-armed Bandit”. Under review.
- Shayan Doroudi, Kenneth Holstein, Vincent Aleven and Emma Brunskill. “Sequence Matters, But How Exactly? A Methodology for Evaluating Activity Sequences from Data”. In Proceedings of the Educational Data Mining (EDM) Conference, 2016.
- Tomáš Kocák, Gergely Neu, Michal Valko. “Online learning with Erdős-Rényi side-observation graphs”. In Proceedings of the Uncertainty in Artificial Intelligence (UAI) Conference.
- Akram Erraqabi. “Error and Regret Trade Off for a Multi Armed Bandit Problem”. Internship report, ENS, ENSAE, November 2015. [pdf]
- MATLAB code for “Error and Regret Trade Off for a Multi Armed Bandit Problem”. [code]