December 8, 2022
GLSI / CtrlA seminar: Quentin Guilloteau (Datamove)

December 8, 2022
Seminar Mario Bravo (room 106)
December 1, 2022
Seminar Stephane Durand – Contagion and influence games: the different forms, approaches, and context

December 8, 2022
GLSI / CtrlA seminar: Quentin Guilloteau (Datamove) – IMAG building (442)
Seminar Mario Bravo (room 106)

December 12, 2022
Seminar Matthieu Jonckheere (room 306) – Title: Parameter Selection in Fermat Distances: Navigating Geometry and Noise

December 15, 2022
PhD defense Chen Yan: Quasi-optimal policies for restless bandits – Thesis supervised by Nicolas Gast and Bruno Gaujal.
The defense will take place on Thursday, December 15, 2022 at 2:00 pm in amphitheater 1 of the IRMA Tower (51 rue des mathématiques, 38610 Gières). A reception will follow the defense in room 406 of the IMAG building.
Jury:
- David Alan Goldberg, Associate Professor, Cornell University (Reviewer)
- Bruno Scherrer, Research Scientist, Inria Nancy (Reviewer)
- Jérôme Malick, Research Director, CNRS (Examiner)
- Nguyễn Kim Thắng, Professor, Université Grenoble Alpes (Examiner)
- Benjamin Legros, Associate Professor, EM Normandie (Examiner)
Abstract:
Bandits are one of the most basic examples of decision-making under uncertainty. A Markovian restless bandit can be seen as the following sequential allocation problem: at each decision epoch, one or several arms are activated (pulled); every arm generates an instantaneous reward that depends on its state and on whether it was activated; the state of each arm then changes in a Markovian fashion, according to an underlying transition matrix. Both the rewards and the transition matrices are known, and the new states are revealed to the decision maker before its next decision. The word restless emphasizes that arms that are not activated can also change state, so restless bandits generalize the simpler rested bandits. In principle, the above problem can be solved by dynamic programming, since it is a Markov decision process. The challenge is the curse of dimensionality: the number of possible states and actions grows exponentially with the number of arms. Consequently, the focus is on designing policies that resolve the tension between computational efficiency and close-to-optimal performance.
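The model described above can be sketched in a few lines of Python. This is a minimal illustration with made-up parameters (S states per arm, N arms, budget M); the myopic priority rule used here is only a placeholder for an index policy, not a policy from the thesis. Note that every arm transitions at every epoch, whether pulled or not, which is exactly the restless property:

```python
import numpy as np

# Hypothetical sketch of a Markovian restless bandit: N statistically
# identical arms with S states each; M arms are activated per epoch.
# Transition matrices P[a] (a = 0 passive, 1 active) and rewards r[a]
# are assumed known, as in the abstract.
rng = np.random.default_rng(0)
S, N, M, T = 3, 10, 3, 1000

# Random transition matrices for the passive (0) and active (1) actions.
P = rng.random((2, S, S))
P /= P.sum(axis=2, keepdims=True)
r = rng.random((2, S))          # r[a, s]: reward in state s under action a

states = rng.integers(0, S, size=N)
total = 0.0
for _ in range(T):
    # Myopic priority (placeholder): pull the M arms with the largest
    # immediate active reward.
    active = np.argsort(-r[1, states])[:M]
    a = np.zeros(N, dtype=int)
    a[active] = 1
    total += r[a, states].sum()
    # Every arm moves, pulled or not -- that is what "restless" means.
    states = np.array([rng.choice(S, p=P[a[i], states[i]]) for i in range(N)])

print(f"average reward per epoch: {total / T:.3f}")
```

The curse of dimensionality is visible here: the joint state space has S**N = 59049 elements already for these small numbers, which is why the thesis studies per-arm policies instead of exact dynamic programming.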
In this thesis, we construct computationally efficient policies with provable performance bounds, which may differ depending on certain properties of the problem. We first investigate the classical Whittle index policy (WIP) on infinite-horizon problems and prove that, when it is asymptotically optimal under the global attractor assumption, it almost always converges to the optimal value exponentially fast. Applying WIP additionally requires the technical assumption of indexability; to get around this, we next study the LP-index policy, which is well defined for any problem and shares the same exponential speed of convergence as WIP under similar assumptions.
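The Whittle index mentioned above can be illustrated with a toy computation: for an indexable arm, the index of a state is the passive subsidy at which resting and activating are equally attractive in that state. A minimal sketch, assuming a discounted single-arm MDP and bisection on the subsidy (all numbers below are made up for illustration; the thesis works in the average-reward setting):

```python
import numpy as np

def q_values(P, r, lam, beta=0.95, iters=2000):
    """Value iteration for the lam-subsidized single-arm MDP:
    the passive action earns an extra subsidy lam per step."""
    S = r.shape[1]
    V = np.zeros(S)
    for _ in range(iters):
        q0 = r[0] + lam + beta * P[0] @ V   # passive, with subsidy
        q1 = r[1] + beta * P[1] @ V         # active
        V = np.maximum(q0, q1)
    return q0, q1

def whittle_index(P, r, s, lo=-10.0, hi=10.0, tol=1e-6):
    """Bisect on lam until passive and active tie in state s."""
    while hi - lo > tol:
        lam = 0.5 * (lo + hi)
        q0, q1 = q_values(P, r, lam)
        if q1[s] > q0[s]:   # still prefers to activate: raise the subsidy
            lo = lam
        else:
            hi = lam
    return 0.5 * (lo + hi)

# Toy 2-state arm (illustrative numbers only).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],    # passive transitions
              [[0.5, 0.5], [0.1, 0.9]]])   # active transitions
r = np.array([[0.0, 0.0],                  # passive rewards
              [0.2, 1.0]])                 # active rewards

for s in (0, 1):
    print(f"index of state {s}: {whittle_index(P, r, s):.4f}")
```

WIP then simply activates, at each epoch, the M arms whose current states have the largest indices. Indexability is the requirement that the set of states preferring the passive action grows monotonically with the subsidy, which is what makes this bisection well defined.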
In the infinite-horizon setting, we always need the global attractor assumption for asymptotic optimality. We next study the problem over a finite horizon, where this assumption is no longer a concern. Instead, LP-compatibility and non-degeneracy are required for asymptotic optimality and a faster convergence rate. We construct the finite-horizon LP-index policy, as well as the LP-update policy, which amounts to re-solving the LP-index policy as the process evolves. This LP-update policy is then generalized to the broader framework of weakly coupled MDPs, together with a generalization of the non-degeneracy condition. When it is satisfied, this condition also allows a more efficient implementation of the LP-update policy and a faster convergence rate for weakly coupled MDPs.
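The LP relaxation underlying LP-index-style policies can be sketched for the infinite-horizon average-reward case: relax the hard per-epoch budget of M activations to an average activation fraction alpha = M/N, and optimize a single arm's stationary occupation measure x[s, a]. The optimal LP value is then an upper bound on the achievable per-arm reward. A hedged sketch using scipy's linprog (toy parameters; this is the generic relaxation, not the thesis's exact formulation):

```python
import numpy as np
from scipy.optimize import linprog

S, alpha = 2, 0.3                          # states, average budget M/N
P = np.array([[[0.9, 0.1], [0.2, 0.8]],    # passive transitions P[0]
              [[0.5, 0.5], [0.1, 0.9]]])   # active transitions  P[1]
r = np.array([[0.0, 0.0],                  # passive rewards r[0, s]
              [0.2, 1.0]])                 # active rewards  r[1, s]

# Decision variable x[s, a] >= 0, flattened as x[s * 2 + a].
n = S * 2
c = np.array([-r[a, s] for s in range(S) for a in range(2)])  # maximize

A_eq, b_eq = [], []
# Stationarity: sum_a x[s', a] = sum_{s, a} x[s, a] * P[a, s, s'].
for sp in range(S):
    row = np.zeros(n)
    for s in range(S):
        for a in range(2):
            row[s * 2 + a] += P[a, s, sp]
            if s == sp:
                row[s * 2 + a] -= 1.0
    A_eq.append(row); b_eq.append(0.0)
# Normalization: x is a probability distribution over (state, action).
A_eq.append(np.ones(n)); b_eq.append(1.0)
# Relaxed budget: the arm is active a fraction alpha of the time.
budget = np.zeros(n)
for s in range(S):
    budget[s * 2 + 1] = 1.0
A_eq.append(budget); b_eq.append(alpha)

res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=[(0, None)] * n, method="highs")
print(f"relaxed per-arm average reward (upper bound): {-res.fun:.4f}")
```

An LP-index policy builds arm priorities from an optimal solution of this relaxation, and the LP-update policy re-solves it along the trajectory; in the weakly coupled MDP framework, the single budget row above becomes one coupling constraint per shared resource.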
Venue: IRMA Tower, Saint Martin d'Hères campus