Scientific Program

Education can transform an individual’s capacity and the opportunities available to them. The proposed work will build on and develop novel machine learning approaches for improving human learning. Massive open online courses (MOOCs) are enabling many more people to access education, but mostly operate using status quo teaching methods. Even more important than access is the opportunity for online software to radically improve the efficiency, engagement and effectiveness of education. Existing intelligent tutoring systems (ITSs) have had some promising successes, but mostly rely on learning sciences research to construct (often quite successful) hand-built strategies for automated teaching. Online systems make it possible to actively collect substantial amounts of data about how people learn, and offer a huge opportunity to substantially accelerate progress in improving education.

An essential aspect of teaching is providing the right learning experience for the student, but it is often unknown a priori exactly how this should be achieved. This challenge can often be cast as an instance of decision-making under uncertainty. For example, Brunskill and colleagues previously cast the choice of which level to give next in an educational game, with the goal of optimizing engagement, as an off-policy reinforcement learning (RL) problem. This allowed them to find a new policy that performed 30% better than baseline approaches when tested with 2000 new students (Mandel et al. 2014).
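To make the off-policy idea concrete, the following is a minimal sketch of how logged data from one teaching policy can be used to estimate the value of a different one. It uses ordinary trajectory-level importance sampling, which is one standard estimator and not necessarily the one used in the cited study. The state and action labels, the `Policy` signature, and the function name are all illustrative assumptions.

```python
from typing import Callable, List, Tuple

# One logged step: (state, action, reward). The labels are hypothetical,
# e.g. state = current skill estimate, action = game level shown next.
Step = Tuple[str, str, float]

# A policy maps (state, action) to the probability of choosing that action.
Policy = Callable[[str, str], float]


def importance_sampling_estimate(
    trajectories: List[List[Step]],
    behavior: Policy,
    target: Policy,
) -> float:
    """Estimate the target policy's expected return from behavior-policy logs.

    Assumes the behavior policy gave non-zero probability to every logged action.
    """
    total = 0.0
    for trajectory in trajectories:
        weight = 1.0  # product of per-step probability ratios
        ret = 0.0     # undiscounted return of this trajectory
        for state, action, reward in trajectory:
            weight *= target(state, action) / behavior(state, action)
            ret += reward
        total += weight * ret
    return total / len(trajectories)
```

Each logged trajectory is reweighted by how much more (or less) likely the candidate policy would have been to produce it, so a new policy can be scored before it is shown to any student. In practice this estimator can have high variance on long trajectories, which is one reason more refined estimators exist.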

In general, one of the distinctive problems of RL is how to explore different strategies while, at the same time, incrementally adapting behavior in response to the feedback received from the environment (e.g., a student’s results). This problem is at the core of the multi-armed bandit (MAB) framework. In recent years, steady advances in MAB research have led to algorithms that have been successfully applied in a number of domains, such as recommendation systems, where it is crucial to adapt incrementally to users’ preferences (a problem similar to assigning suitable learning activities to different students). The proposed collaboration is thus intended to explore the potential interactions between online education and RL and MAB. On the one hand, we will define novel RL and MAB settings and problems in online education. On the other hand, we will investigate how solutions developed in RL and MAB could be integrated into ITSs and MOOCs to improve their effectiveness.
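As an illustration of the exploration and adaptation trade-off described above, the sketch below applies the classic UCB1 bandit algorithm to a toy version of activity assignment. Everything specific here is an assumption made for illustration: each "arm" is a hypothetical learning activity, the reward is a simulated 0/1 success signal, and the success rates are invented. It is not a description of the algorithms the proposal would develop.

```python
import math
import random
from typing import List


class UCB1:
    """UCB1 bandit: pick the arm with the highest optimistic reward estimate."""

    def __init__(self, n_arms: int) -> None:
        self.counts = [0] * n_arms   # times each arm was chosen
        self.means = [0.0] * n_arms  # running mean reward per arm

    def select(self) -> int:
        # Try every arm once before using confidence bounds.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        scores = [
            mean + math.sqrt(2.0 * math.log(total) / count)
            for mean, count in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm: int, reward: float) -> None:
        self.counts[arm] += 1
        # Incremental update of the running mean.
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def simulate(success_rates: List[float], n_students: int, seed: int = 0) -> List[int]:
    """Assign one activity per simulated student; return how often each was used."""
    rng = random.Random(seed)
    bandit = UCB1(len(success_rates))
    for _ in range(n_students):
        arm = bandit.select()
        reward = 1.0 if rng.random() < success_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit.counts
```

Calling `simulate([0.2, 0.5, 0.8], 2000)` returns the number of simulated students assigned to each activity. The confidence bonus shrinks as an activity is tried more often, so assignments concentrate on the activity with the best observed outcomes while the others are still sampled occasionally.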
