Publications – SequeL

HAL publications of the SequeL lab/project-team

2022

Conference papers

titre: On the role of population heterogeneity in emergent communication
auteur: Mathieu Rita, Florian Strub, Jean-Bastien Grill, Olivier Pietquin, Emmanuel Dupoux
article: ICLR 2022 – Tenth International Conference on Learning Representations, Apr 2022, Los Angeles, United States
Accès au texte intégral et bibtex

2020

Journal articles

titre: Temperature Decreases Spread Parameters of the New Covid-19 Case Dynamics
auteur: Jacques Demongeot, Yannis Flet-Berliac, Hervé Seligmann
article: Biology, 2020, 9 (5), pp.94. ⟨10.3390/biology9050094⟩
Accès au bibtex

titre: Machine learning applications in drug development
auteur: Clémence Réda, Emilie Kaufmann, Andrée Delahaye-Duriez
article: Computational and Structural Biotechnology Journal, 2020, 18, pp.241-252. ⟨10.1016/j.csbj.2019.12.006⟩
Accès au texte intégral et bibtex

Conference papers

titre: Robust-Adaptive Interval Predictive Control for Linear Uncertain Systems
auteur: Edouard Leurent, Denis Efimov, Odalric-Ambrym Maillard
article: CDC 2020 – 59th IEEE Conference on Decision and Control, Dec 2020, Jeju Island / Virtual, South Korea
Accès au texte intégral et bibtex

titre: Robust-Adaptive Control of Linear Systems: beyond Quadratic Costs
auteur: Edouard Leurent, Denis Efimov, Odalric-Ambrym Maillard
article: NeurIPS 2020 – 34th Conference on Neural Information Processing Systems, Dec 2020, Vancouver / Virtual, Canada
Accès au texte intégral et bibtex

titre: Monte-Carlo Graph Search: the Value of Merging Similar States
auteur: Edouard Leurent, Odalric-Ambrym Maillard
article: ACML 2020 – 12th Asian Conference on Machine Learning, Nov 2020, Bangkok / Virtual, Thailand. pp.577 – 602
Accès au texte intégral et bibtex

titre: A Machine of Few Words: Interactive Speaker Recognition with Reinforcement Learning
auteur: Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin
article: Conference of the International Speech Communication Association (INTERSPEECH), Oct 2020, Shanghai, China. ⟨10.21437/Interspeech.2020-2892⟩
Accès au texte intégral et bibtex

titre: A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players
auteur: Etienne Boursier, Emilie Kaufmann, Abbas Mehrabian, Vianney Perchet
article: AISTATS 2020 – 23rd International Conference on Artificial Intelligence and Statistics, Aug 2020, Palermo, Italy
Accès au texte intégral et bibtex

titre: Gamification of pure exploration for linear bandits
auteur: Rémy Degenne, Pierre Ménard, Xuedong Shang, Michal Valko
article: ICML 2020 – International Conference on Machine Learning, Aug 2020, Vienna / Virtual, Austria
Accès au texte intégral et bibtex

titre: “I’m sorry Dave, I’m afraid I can’t do that” Deep Q-Learning From Forbidden Actions
auteur: Mathieu Seurin, Philippe Preux, Olivier Pietquin
article: International Joint Conference on Neural Networks, Jul 2020, Glasgow, United Kingdom
Accès au texte intégral et bibtex

titre: Only Relevant Information Matters: Filtering Out Noisy Samples to Boost RL
auteur: Yannis Flet-Berliac, Philippe Preux
article: IJCAI 2020 – International Joint Conference on Artificial Intelligence, Jul 2020, Yokohama, Japan. ⟨10.24963/ijcai.2020/376⟩
Accès au texte intégral et bibtex

titre: The Influence of Shape Constraints on the Thresholding Bandit Problem
auteur: James Cheshire, Pierre Ménard, Alexandra Carpentier
article: COLT 2020 – Thirty Third Conference on Learning Theory, Jul 2020, Graz / Virtual, Austria. pp.1228-1275
Accès au texte intégral et bibtex

titre: Tightening Exploration in Upper Confidence Reinforcement Learning
auteur: Hippolyte Bourel, Odalric-Ambrym Maillard, Mohammad Sadegh Talebi
article: International Conference on Machine Learning, Jul 2020, Vienna, Austria
Accès au texte intégral et bibtex

titre: Solving Bernoulli Rank-One Bandits with Unimodal Thompson Sampling
auteur: Cindy Trinh, Emilie Kaufmann, Claire Vernade, Richard Combes
article: ALT 2020 – 31st International Conference on Algorithmic Learning Theory, Feb 2020, San Diego, United States. pp.1 – 28
Accès au texte intégral et bibtex

titre: Covariance-adapting algorithm for semi-bandits with application to sparse outcomes
auteur: Pierre Perrault, Vianney Perchet, Michal Valko
article: Conference on Learning Theory, 2020, Graz, Austria
Accès au texte intégral et bibtex

titre: Fixed-confidence guarantees for Bayesian best-arm identification
auteur: Xuedong Shang, Rianne de Heide, Emilie Kaufmann, Pierre Ménard, Michal Valko
article: International Conference on Artificial Intelligence and Statistics, 2020, Palermo, Italy
Accès au texte intégral et bibtex

titre: Budgeted online influence maximization
auteur: Pierre Perrault, Jennifer Healey, Zheng Wen, Michal Valko
article: International Conference on Machine Learning, 2020, Vienna, Austria
Accès au texte intégral et bibtex

Theses

titre: Safe and Efficient Reinforcement Learning for Behavioural Planning in Autonomous Driving
auteur: Edouard Leurent
article: Computer Science [cs]. Université de Lille, 2020. English.
Accès au texte intégral et bibtex

titre: Multimodal and Interactive Models for Visually Grounded Language Learning
auteur: Florian Strub
article: Neural and Evolutionary Computing [cs.NE]. Université de Lille; École doctorale, ED SPI 074 : Sciences pour l’Ingénieur, 2020. English.
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: Adversarial Attacks on Linear Contextual Bandits
auteur: Evrard Garcelon, Baptiste Roziere, Laurent Meunier, Jean Tarbouriech, Olivier Teytaud, Alessandro Lazaric, Matteo Pirotta
article: 2020
Accès au bibtex

titre: Stochastic bandits with vector losses: Minimizing $\ell^\infty$-norm of relative losses
auteur: Xuedong Shang, Han Shao, Jian Qian
article: 2020
Accès au texte intégral et bibtex

titre: Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications
auteur: Sarah Perrin, Julien Pérolat, Mathieu Laurière, Matthieu Geist, Romuald Elie, Olivier Pietquin
article: 2020
Accès au bibtex

titre: Optimal Strategies for Graph-Structured Bandits
auteur: Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard
article: 2020
Accès au texte intégral et bibtex

titre: Forced-exploration free Strategies for Unimodal Bandits
auteur: Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard
article: 2020
Accès au texte intégral et bibtex

2019

Journal articles

titre: Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits
auteur: Alexander Luedtke, Emilie Kaufmann, Antoine Chambaz
article: Machine Learning, 2019, 108 (11), pp.1919-1949. ⟨10.1007/s10994-019-05799-x⟩
Accès au texte intégral et bibtex

titre: Accurate reconstruction of EBSD datasets by a multimodal data approach using an evolutionary algorithm
auteur: Marie-Agathe Charpagne, Florian Strub, Tresa M. Pollock
article: Materials Characterization, 2019, 150, pp.184-198. ⟨10.1016/j.matchar.2019.01.033⟩
Accès au bibtex

titre: DPPy: Sampling Determinantal Point Processes with Python
auteur: Guillaume Gautier, Rémi Bardenet, Michal Valko
article: Journal of Machine Learning Research, 2019
Accès au texte intégral et bibtex

Conference papers

titre: MERL: Multi-Head Reinforcement Learning
auteur: Yannis Flet-Berliac, Philippe Preux
article: Deep Reinforcement Learning Workshop, NeurIPS, Dec 2019, Vancouver, Canada
Accès au texte intégral et bibtex

titre: Learning Multiple Markov Chains via Adaptive Allocation
auteur: Mohammad Sadegh Talebi, Odalric-Ambrym Maillard
article: Advances in Neural Information Processing Systems 32 (NIPS 2019), Dec 2019, Vancouver, Canada
Accès au texte intégral et bibtex

titre: Budgeted Reinforcement Learning in Continuous State Space
auteur: Nicolas Carrara, Edouard Leurent, Romain Laroche, Tanguy Urvoy, Odalric-Ambrym Maillard, Olivier Pietquin
article: Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada
Accès au texte intégral et bibtex

titre: Regret Bounds for Learning State Representations in Reinforcement Learning
auteur: Ronald Ortner, Matteo Pirotta, Ronan Fruit, Alessandro Lazaric, Odalric-Ambrym Maillard
article: Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada
Accès au texte intégral et bibtex

titre: Model-Based Reinforcement Learning Exploiting State-Action Equivalence
auteur: Mahsa Asadi, Mohammad Sadegh Talebi, Hippolyte Bourel, Odalric-Ambrym Maillard
article: ACML 2019, Proceedings of Machine Learning Research, Nov 2019, Nagoya, Japan. pp.204 – 219
Accès au texte intégral et bibtex

titre: Energy Management for Microgrids: a Reinforcement Learning Approach
auteur: Tanguy Levent, Philippe Preux, Erwan Le Pennec, Jordi Badosa, Gonzague Henri, Yvan Bonnassieux
article: ISGT-Europe 2019 – IEEE PES Innovative Smart Grid Technologies Europe, Sep 2019, Bucharest, Romania. pp.1-5, ⟨10.1109/ISGTEurope.2019.8905538⟩
Accès au texte intégral et bibtex

titre: Practical Open-Loop Optimistic Planning
auteur: Edouard Leurent, Odalric-Ambrym Maillard
article: European Conference on Machine Learning, Sep 2019, Würzburg, Germany
Accès au texte intégral et bibtex

titre: Non-asymptotic analysis of a sequential rupture detection test and its application to non-stationary bandits
auteur: Lilian Besson, Emilie Kaufmann
article: GRETSI 2019 – XXVIIème Colloque francophone de traitement du signal et des images, Aug 2019, Lille, France
Accès au texte intégral et bibtex

titre: On two ways to use determinantal point processes for Monte Carlo integration — Long version
auteur: Guillaume Gautier, Rémi Bardenet, Michal Valko
article: NeurIPS 2019 – Thirty-third Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada
Accès au texte intégral et bibtex

titre: A simple dynamic bandit algorithm for hyper-parameter tuning
auteur: Xuedong Shang, Emilie Kaufmann, Michal Valko
article: Workshop on Automated Machine Learning at International Conference on Machine Learning, AutoML@ICML 2019 – 6th ICML Workshop on Automated Machine Learning, Jun 2019, Long Beach, United States
Accès au texte intégral et bibtex

titre: On two ways to use determinantal point processes for Monte Carlo integration
auteur: Guillaume Gautier, Rémi Bardenet, Michal Valko
article: NEGDEPML 2019 – ICML Workshop on Negative Dependence in ML, Jun 2019, Long Beach, CA, United States
Accès au texte intégral et bibtex

titre: Decentralized Spectrum Learning for IoT Wireless Networks Collision Mitigation
auteur: Christophe Moy, Lilian Besson
article: ISIoT 2019 – 1st International Workshop on Intelligent Systems for the Internet of Things, May 2019, Santorini, Greece
Accès au texte intégral et bibtex

titre: Finding the bandit in a graph: Sequential search-and-stop
auteur: Pierre Perrault, Vianney Perchet, Michal Valko
article: 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019), Apr 2019, Okinawa, Japan
Accès au texte intégral et bibtex

titre: GNU Radio Implementation of MALIN: “Multi-Armed bandits Learning for Internet-of-things Networks”
auteur: Lilian Besson, Rémi Bonnefoi, Christophe Moy
article: IEEE WCNC 2019 – IEEE Wireless Communications and Networking Conference, Apr 2019, Marrakech, Morocco. ⟨10.1109/WCNC.2019.8885841⟩
Accès au texte intégral et bibtex

titre: Upper-Confidence Bound for Channel Selection in LPWA Networks with Retransmissions
auteur: Rémi Bonnefoi, Lilian Besson, Julio Manco-Vasquez, Christophe Moy
article: The 1st International Workshop on Mathematical Tools and technologies for IoT and mMTC Networks Modeling, Philippe Mary, Samir Perlaza, Petar Popovski, Apr 2019, Marrakech, Morocco
Accès au texte intégral et bibtex

titre: Scale-free adaptive planning for deterministic dynamics & discounted rewards
auteur: Peter Bartlett, Victor Gabillon, Jennifer Healey, Michal Valko
article: International Conference on Machine Learning, 2019, Long Beach, United States
Accès au texte intégral et bibtex

titre: Planning in entropy-regularized Markov decision processes and games
auteur: Jean-Bastien Grill, Omar D Domingues, Pierre Ménard, Rémi Munos, Michal Valko
article: Neural Information Processing Systems, 2019, Vancouver, Canada
Accès au texte intégral et bibtex

titre: Active multiple matrix completion with adaptive confidence sets
auteur: Andrea Locatelli, Alexandra Carpentier, Michal Valko
article: International Conference on Artificial Intelligence and Statistics, 2019, Okinawa, Japan
Accès au texte intégral et bibtex

titre: Gaussian process optimization with adaptive sketching: Scalable and no regret
auteur: Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco
article: Conference on Learning Theory, 2019, Phoenix, United States
Accès au texte intégral et bibtex

titre: Exploiting structure of uncertainty for efficient matroid semi-bandits
auteur: Pierre Perrault, Vianney Perchet, Michal Valko
article: International Conference on Machine Learning, 2019, Long Beach, United States
Accès au texte intégral et bibtex

titre: A simple parameter-free and adaptive approach to optimization under a minimal local smoothness assumption
auteur: Peter Bartlett, Victor Gabillon, Michal Valko
article: Algorithmic Learning Theory, 2019, Chicago, United States
Accès au texte intégral et bibtex

titre: Sequential change-point detection: Laplace concentration of scan statistics and non-asymptotic delay bounds
auteur: Odalric-Ambrym Maillard
article: Algorithmic Learning Theory, 2019, Chicago, United States. pp.1 – 23
Accès au texte intégral et bibtex

titre: Rotting bandits are not harder than stochastic ones
auteur: Julien Seznec, Andrea Locatelli, Alexandra Carpentier, Alessandro Lazaric, Michal Valko
article: International Conference on Artificial Intelligence and Statistics, 2019, Naha, Japan
Accès au texte intégral et bibtex

titre: General parallel optimization without a metric
auteur: Xuedong Shang, Emilie Kaufmann, Michal Valko
article: Algorithmic Learning Theory, 2019, Chicago, United States
Accès au texte intégral et bibtex

Theses

titre: Tree search applied to optimization and planning
auteur: Jean-Bastien Grill
article: Computer Science [cs]. Lille 1, 2019. English.
Accès au bibtex

titre: Reinforcement learning for Dialogue Systems optimization with user adaptation.
auteur: Nicolas Carrara
article: Artificial Intelligence [cs.AI]. École Doctorale Sciences pour l’Ingénieur, Université Lille Nord-de-France, 2019. English.
Accès au texte intégral et bibtex

titre: Exploration-exploitation dilemma in Reinforcement Learning under various form of prior knowledge
auteur: Ronan Fruit
article: Artificial Intelligence [cs.AI]. Université de Lille 1, Sciences et Technologies; CRIStAL UMR 9189, 2019. English.
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: Non-Asymptotic Pure Exploration by Solving Games
auteur: Rémy Degenne, Wouter M. Koolen, Pierre Ménard
article: 2019
Accès au texte intégral et bibtex

titre: High-Dimensional Control Using Generalized Auxiliary Tasks
auteur: Yannis Flet-Berliac, Philippe Preux
article: 2019
Accès au texte intégral et bibtex

titre: Self-Educated Language Agent With Hindsight Experience Replay For Instruction Following
auteur: Geoffrey Cideron, Mathieu Seurin, Florian Strub, Olivier Pietquin
article: 2019
Accès au bibtex

titre: Social Attention for Autonomous Decision-Making in Dense Traffic
auteur: Edouard Leurent, Jean Mercat
article: 2019
Accès au texte intégral et bibtex

titre: Active Roll-outs in MDP with Irreversible Dynamics
auteur: Odalric-Ambrym Maillard, Timothy Mann, Ronald Ortner, Shie Mannor
article: 2019
Accès au texte intégral et bibtex

titre: Accurate reconstruction of EBSD datasets by a multimodal data approach using an evolutionary algorithm
auteur: Florian Strub, Marie-Agathe Charpagne, Tresa M. Pollock
article: 2019
Accès au bibtex

2018

Journal articles

titre: A Spectral Algorithm with Additive Clustering for the Recovery of Overlapping Communities in Networks
auteur: Emilie Kaufmann, Thomas Bonald, Marc Lelarge
article: Theoretical Computer Science, 2018, 742, pp.3-26. ⟨10.1016/j.tcs.2017.12.028⟩
Accès au texte intégral et bibtex

titre: Correctness Attraction: A Study of Stability of Software Behavior Under Runtime Perturbation
auteur: Benjamin Danglot, Philippe Preux, Benoit Baudry, Martin Monperrus
article: Empirical Software Engineering, 2018, 23 (4), pp.2086-2119. ⟨10.1007/s10664-017-9571-8⟩
Accès au texte intégral et bibtex

titre: Feature-wise transformations
auteur: Vincent Dumoulin, Ethan Perez, Harm de Vries, Florian Strub, Nathan Schucher, Aaron Courville, Yoshua Bengio
article: Distill, 2018, 3 (7), ⟨10.23915/distill.00011⟩
Accès au bibtex

titre: Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs
auteur: Mohammad Sadegh Talebi, Odalric-Ambrym Maillard
article: Journal of Machine Learning Research, 2018, pp.1-36
Accès au texte intégral et bibtex

titre: On Bayesian index policies for sequential resource allocation
auteur: Emilie Kaufmann
article: Annals of Statistics, 2018, 46 (2), pp.842-865. ⟨10.1214/17-AOS1569⟩
Accès au texte intégral et bibtex

titre: Streaming kernel regression with provably adaptive mean, variance, and regularization
auteur: Audrey Durand, Odalric-Ambrym Maillard, Joelle Pineau
article: Journal of Machine Learning Research, 2018, 1, pp.1 – 48
Accès au texte intégral et bibtex

titre: Boundary Crossing Probabilities for General Exponential Families
auteur: Odalric-Ambrym Maillard
article: Mathematical Methods of Statistics, 2018, 27, pp.1-31. ⟨10.3103/S1066530718010015⟩
Accès au texte intégral et bibtex

Conference papers

titre: Approximate Robust Control of Uncertain Dynamical Systems
auteur: Edouard Leurent, Yann Blanco, Denis Efimov, Odalric-Ambrym Maillard
article: Proc. MLITS Workshop at NeurIPS, Dec 2018, Montreal, Canada
Accès au texte intégral et bibtex

titre: Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling
auteur: Emilie Kaufmann, Wouter Koolen, Aurélien Garivier
article: Advances in Neural Information Processing Systems (NIPS), Dec 2018, Montréal, Canada
Accès au texte intégral et bibtex

titre: Fighting Boredom in Recommender Systems with Linear Reinforcement Learning
auteur: Romain Warlop, Alessandro Lazaric, Jérémie Mary
article: Neural Information Processing Systems, Dec 2018, Montreal, Canada. ⟨10.5555/3326943.3327105⟩
Accès au texte intégral et bibtex

titre: Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes
auteur: Ronan Fruit, Matteo Pirotta, Alessandro Lazaric
article: 32nd Conference on Neural Information Processing Systems, Dec 2018, Montréal, Canada
Accès au texte intégral et bibtex

titre: Optimistic optimization of a Brownian
auteur: Jean-Bastien Grill, Michal Valko, Rémi Munos
article: NeurIPS 2018 – Thirty-second Conference on Neural Information Processing Systems, Dec 2018, Montréal, Canada
Accès au texte intégral et bibtex

titre: Safe transfer learning for dialogue applications
auteur: Nicolas Carrara, Romain Laroche, Jean-Léon Bouraoui, Tanguy Urvoy, Olivier Pietquin
article: SLSP 2018 – 6th International Conference on Statistical Language and Speech Processing, Oct 2018, Mons, Belgium
Accès au texte intégral et bibtex

titre: A Fitted-Q Algorithm for Budgeted MDPs
auteur: Nicolas Carrara, Romain Laroche, Jean-Léon Bouraoui, Tanguy Urvoy, Olivier Pietquin
article: EWRL 2018 – 14th European workshop on Reinforcement Learning, Oct 2018, Lille, France
Accès au texte intégral et bibtex

titre: Adaptive black-box optimization got easier: HCT only needs local smoothness
auteur: Xuedong Shang, Emilie Kaufmann, Michal Valko
article: European Workshop on Reinforcement Learning, Oct 2018, Lille, France
Accès au texte intégral et bibtex

titre: Compressing the Input for CNNs with the First-Order Scattering Transform
auteur: Edouard Oyallon, Eugene Belilovsky, Sergey Zagoruyko, Michal Valko
article: ECCV 2018 – European Conference on Computer Vision, Sep 2018, Munich, Germany
Accès au texte intégral et bibtex

titre: Visual Reasoning with Multi-hop Feature Modulation
auteur: Florian Strub, Mathieu Seurin, Ethan Perez, Harm de Vries, Jérémie Mary, Philippe Preux, Aaron Courville, Olivier Pietquin
article: ECCV 2018 – 15th European Conference on Computer Vision, Sep 2018, Munich, Germany. pp.808-831
Accès au texte intégral et bibtex

titre: Importance Weighted Transfer of Samples in Reinforcement Learning
auteur: Andrea Tirinzoni, Andrea Sessa, Matteo Pirotta, Marcello Restelli
article: ICML 2018 – The 35th International Conference on Machine Learning, Jul 2018, Stockholm, Sweden. pp.4936-4945
Accès au texte intégral et bibtex

titre: Improved large-scale graph learning through ridge spectral sparsification
auteur: Daniele Calandriello, Ioannis Koutis, Alessandro Lazaric, Michal Valko
article: International Conference on Machine Learning, Jul 2018, Stockholm, Sweden
Accès au texte intégral et bibtex

titre: Training Dialogue Systems With Human Advice
auteur: Merwan Barlier, Romain Laroche, Olivier Pietquin
article: AAMAS 2018 – the 17th International Conference on Autonomous Agents and Multiagent Systems, Jul 2018, Stockholm, Sweden. pp.9
Accès au texte intégral et bibtex

titre: Stochastic Variance-Reduced Policy Gradient
auteur: Matteo Papini, Damiano Binaghi, Giuseppe Canonaco, Matteo Pirotta, Marcello Restelli
article: ICML 2018 – 35th International Conference on Machine Learning, Jul 2018, Stockholm, Sweden. pp.4026-4035
Accès au texte intégral et bibtex

titre: Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning
auteur: Ronan Fruit, Matteo Pirotta, Alessandro Lazaric, Ronald Ortner
article: ICML 2018 – The 35th International Conference on Machine Learning, Jul 2018, Stockholm, Sweden. pp.1578-1586
Accès au texte intégral et bibtex

titre: i-RevNet: Deep Invertible Networks
auteur: Jörn-Henrik Jacobsen, Arnold Smeulders, Edouard Oyallon
article: ICLR 2018 – International Conference on Learning Representations, Apr 2018, Vancouver, Canada
Accès au texte intégral et bibtex

titre: End-to-End Automatic Speech Translation of Audiobooks
auteur: Alexandre Bérard, Laurent Besacier, Ali Can Kocabiyikoglu, Olivier Pietquin
article: ICASSP 2018 – IEEE International Conference on Acoustics, Speech and Signal Processing, Apr 2018, Calgary, Alberta, Canada
Accès au texte intégral et bibtex

titre: Aggregation of Multi-Armed Bandits Learning Algorithms for Opportunistic Spectrum Access
auteur: Lilian Besson, Emilie Kaufmann, Christophe Moy
article: IEEE WCNC – IEEE Wireless Communications and Networking Conference, Apr 2018, Barcelona, Spain. ⟨10.1109/wcnc.2018.8377070⟩
Accès au texte intégral et bibtex

titre: Actor-Critic Fictitious Play in Simultaneous Move Multistage Games
auteur: Julien Pérolat, Bilal Piot, Olivier Pietquin
article: AISTATS 2018 – 21st International Conference on Artificial Intelligence and Statistics, Apr 2018, Playa Blanca, Lanzarote, Canary Islands, Spain
Accès au texte intégral et bibtex

titre: Corrupt Bandits for Preserving Local Privacy
auteur: Pratik Gajane, Tanguy Urvoy, Emilie Kaufmann
article: ALT 2018 – Algorithmic Learning Theory, Apr 2018, Lanzarote, Spain
Accès au texte intégral et bibtex

titre: Pure Exploration in Infinitely-Armed Bandit Models with Fixed-Confidence
auteur: Maryam Aziz, Jesse Anderton, Emilie Kaufmann, Javed Aslam
article: ALT 2018 – Algorithmic Learning Theory, Apr 2018, Lanzarote, Spain
Accès au texte intégral et bibtex

titre: Multi-Player Bandits Revisited
auteur: Lilian Besson, Emilie Kaufmann
article: Algorithmic Learning Theory, Mehryar Mohri; Karthik Sridharan, Apr 2018, Lanzarote, Spain
Accès au texte intégral et bibtex

titre: FiLM: Visual Reasoning with a General Conditioning Layer
auteur: Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, Aaron Courville
article: AAAI Conference on Artificial Intelligence, Feb 2018, New Orleans, United States
Accès au bibtex

titre: Best of both worlds: Stochastic & adversarial best-arm identification
auteur: Yasin Abbasi-Yadkori, Peter Bartlett, Victor Gabillon, Alan Malek, Michal Valko
article: Conference on Learning Theory, 2018, Stockholm, Sweden
Accès au texte intégral et bibtex

Other publications

titre: A Fitted-Q Algorithm for Budgeted MDPs
auteur: Nicolas Carrara, Romain Laroche, Jean-Léon Bouraoui, Tanguy Urvoy, Olivier Pietquin
article: 2018
Accès au texte intégral et bibtex

Poster communications

titre: Memory Bandits: Towards the Switching Bandit Problem Best Resolution
auteur: Réda Alami, Odalric-Ambrym Maillard, Raphaël Féraud
article: MLSS 2018 – Machine Learning Summer School, Aug 2018, Madrid, Spain
Accès au texte intégral et bibtex

titre: Multi-Armed bandit Learning in Iot Networks (MALIN)
auteur: Rémi Bonnefoi, Lilian Besson, Christophe Moy
article: ICT 2018 – 25th International Conference on Telecommunications, Jun 2018, Saint-Malo, France
Accès au texte intégral et bibtex

titre: Multi-Player Bandits Revisited
auteur: Lilian Besson
article: Séminaire « IETR : Interagir Evaluer Transmettre Réunir », Jun 2018, Vannes, France
Accès au texte intégral et bibtex

Summary notes

titre: A Note on the Ei Function and a Useful Sum-Inequality
auteur: Lilian Besson
article: 2018
Accès au texte intégral et bibtex

Theses

titre: On the role of the human being in human/machine dialogue
auteur: Merwan Barlier
article: Computer Science [cs]. Université de Lille, 2018. French.
Accès au texte intégral et bibtex

titre: Novel Learning and Exploration-Exploitation Methods for Effective Recommender Systems
auteur: Romain Warlop
article: Artificial Intelligence [cs.AI]. Lille 1, 2018. English.
Accès au texte intégral et bibtex

titre: Neural Machine Translation Architectures and Applications
auteur: Alexandre Bérard
article: Computer Science [cs]. Université de Lille, 2018. English.
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: Calibrated Fairness in Bandits
auteur: Yang Liu, Goran Radanovic, Christos Dimitrakakis, Debmalya Mandal, David C Parkes
article: 2018
Accès au texte intégral et bibtex

titre: Deep Reinforcement Learning and the Deadly Triad
auteur: Hado van Hasselt, Yotam Doron, Florian Strub, Matteo Hessel, Nicolas Sonnerat, Joseph Modayil
article: 2018
Accès au bibtex

titre: Upper Confidence Reinforcement Learning exploiting state-action equivalence
auteur: Odalric-Ambrym Maillard, Mahsa Asadi
article: 2018
Accès au texte intégral et bibtex

titre: A Survey of State-Action Representations for Autonomous Driving
auteur: Edouard Leurent
article: 2018
Accès au texte intégral et bibtex

titre: Recurrent Neural Networks for Long and Short-Term Sequential Recommendation
auteur: Kiewan Villatel, Elena Smirnova, Jérémie Mary, Philippe Preux
article: 2018
Accès au texte intégral et bibtex

titre: SMPyBandits: an Experimental Framework for Single and Multi-Players Multi-Arms Bandits Algorithms in Python
auteur: Lilian Besson
article: 2018
Accès au texte intégral et bibtex

titre: What Doubling Tricks Can and Can’t Do for Multi-Armed Bandits
auteur: Lilian Besson, Emilie Kaufmann
article: 2018
Accès au texte intégral et bibtex

2017

Journal articles

titre: DUCT: An Upper Confidence Bound Approach to Distributed Constraint Optimisation Problems
auteur: Brammert Ottens, Christos Dimitrakakis, Boi Faltings
article: ACM Transactions on Intelligent Systems and Technology, 2017, 8 (5), pp.1 – 27. ⟨10.1145/3066156⟩
Accès au texte intégral et bibtex

titre: A Large-scale Study of Call Graph-based Impact Prediction using Mutation Testing
auteur: Vincenzo Musco, Martin Monperrus, Philippe Preux
article: Software Quality Journal, 2017, 25 (3), pp.921-950. ⟨10.1007/s11219-016-9332-8⟩
Accès au texte intégral et bibtex

titre: Bridging the Gap Between Imitation Learning and Inverse Reinforcement Learning
auteur: Bilal Piot, Matthieu Geist, Olivier Pietquin
article: IEEE Transactions on Neural Networks and Learning Systems, 2017, 28 (8), pp.1814 – 1826. ⟨10.1109/TNNLS.2016.2543000⟩
Accès au texte intégral et bibtex

titre: Differential Privacy for Bayesian Inference through Posterior Sampling
auteur: Christos Dimitrakakis, Blaine Nelson, Zuhe Zhang, Aikaterini Mitrokotsa, Benjamin I P Rubinstein
article: Journal of Machine Learning Research, 2017, 18 (11), pp.1−39
Accès au texte intégral et bibtex

titre: Learning the distribution with largest mean: two bandit frameworks
auteur: Emilie Kaufmann, Aurélien Garivier
article: ESAIM: Proceedings and Surveys, 2017, 60, pp.114 – 131. ⟨10.1051/proc/201760114⟩
Accès au texte intégral et bibtex

Conference papers

titre: Unwritten Languages Demand Attention Too! Word Discovery with Encoder-Decoder Models
auteur: Marcely Zanon Boito, Alexandre Bérard, Aline Villavicencio, Laurent Besacier
article: IEEE Automatic Speech Recognition and Understanding (ASRU), Dec 2017, Okinawa, Japan
Accès au texte intégral et bibtex

titre: HoME: a Household Multimodal Environment
auteur: Simon Brodeur, Ethan Perez, Ankesh Anand, Florian Golemo, Luca Celotti, Florian Strub, Jean Rouat, Hugo Larochelle, Aaron Courville
article: NIPS 2017’s Visually-Grounded Interaction and Language Workshop, Dec 2017, Long Beach, United States
Accès au bibtex

titre: Modulating early visual processing by language
auteur: Harm de Vries, Florian Strub, Jérémie Mary, Hugo Larochelle, Olivier Pietquin, Aaron Courville
article: NIPS 2017 – Conference on Neural Information Processing Systems, Dec 2017, Long Beach, United States. pp.1-14
Accès au texte intégral et bibtex

titre: Is the Bellman residual a bad proxy?
auteur: Matthieu Geist, Bilal Piot, Olivier Pietquin
article: NIPS 2017 – Advances in Neural Information Processing Systems, Dec 2017, Long Beach, United States. pp.1-13
Accès au texte intégral et bibtex

titre: Regret Minimization in MDPs with Options without Prior Knowledge
auteur: Ronan Fruit, Matteo Pirotta, Alessandro Lazaric, Emma Brunskill
article: NIPS 2017 – Neural Information Processing Systems, Dec 2017, Long Beach, United States. pp.1-36
Accès au texte intégral et bibtex

titre: Memory Bandits: a Bayesian approach for the Switching Bandit Problem
auteur: Réda Alami, Odalric-Ambrym Maillard, Raphaël Féraud
article: NIPS 2017 – 31st Conference on Neural Information Processing Systems, Dec 2017, Long Beach, United States
Accès au texte intégral et bibtex

titre: Compatible Reward Inverse Reinforcement Learning
auteur: Alberto Maria Metelli, Matteo Pirotta, Marcello Restelli
article: The Thirty-first Annual Conference on Neural Information Processing Systems – NIPS 2017, Dec 2017, Long Beach, United States
Accès au texte intégral et bibtex

titre: Adaptive Batch Size for Safe Policy Gradients
auteur: Matteo Papini, Matteo Pirotta, Marcello Restelli
article: The Thirty-first Annual Conference on Neural Information Processing Systems (NIPS), Dec 2017, Long Beach, United States
Accès au texte intégral et bibtex

titre: Online influence maximization under independent cascade model with semi-bandit feedback
auteur: Zheng Wen, Branislav Kveton, Michal Valko, Sharan Vaswani
article: Neural Information Processing Systems, Dec 2017, Long Beach, United States. pp.1-24
Accès au texte intégral et bibtex

titre: Independence clustering (without a matrix)
auteur: Daniil Ryabko
article: NIPS 2017 – Thirty-first Annual Conference on Neural Information Processing Systems, Dec 2017, Long Beach, United States. pp.1-14
Accès au bibtex

titre: Monte-Carlo Tree Search by Best Arm Identification
auteur: Emilie Kaufmann, Wouter M. Koolen
article: NIPS 2017 – 31st Annual Conference on Neural Information Processing Systems, Dec 2017, Long Beach, United States. pp.1-23
Accès au texte intégral et bibtex

titre: A generative model for sparse, evolving digraphs
auteur: Georgios Papoudakis, Philippe Preux, Martin Monperrus
article: 6th International Conference on Complex Networks and their Applications, Nov 2017, Lyon, France. pp.531-542, ⟨10.1007/978-3-319-72150-7_43⟩
Accès au texte intégral et bibtex

titre: Universality of Bayesian mixture predictors
auteur: Daniil Ryabko
article: ALT 2017 – 28th International Conference on Algorithmic Learning Theory, Oct 2017, Kyoto, Japan. pp.1-13
Accès au bibtex

titre: Hypotheses testing on infinite random graphs
auteur: Daniil Ryabko
article: ALT 2017 – 28th International Conference on Algorithmic Learning Theory, Oct 2017, Kyoto, Japan. pp.1-12
Accès au bibtex

titre: Efficient tracking of a growing number of experts
auteur: Jaouad Mourtada, Odalric-Ambrym Maillard
article: Algorithmic Learning Theory, Oct 2017, Kyoto, Japan. pp.1 – 23
Accès au texte intégral et bibtex

titre: Boundary Crossing for General Exponential Families
auteur: Odalric-Ambrym Maillard
article: Algorithmic Learning Theory, Oct 2017, Kyoto, Japan. pp.1 – 34
Accès au texte intégral et bibtex

titre: Multi-Armed Bandit Learning in IoT Networks: Learning helps even in non-stationary settings
auteur: Rémi Bonnefoi, Lilian Besson, Christophe Moy, Emilie Kaufmann, Jacques Palicot
article: CROWNCOM 2017 – 12th EAI International Conference on Cognitive Radio Oriented Wireless Networks, Sep 2017, Lisbon, Portugal. pp.173-185, ⟨10.1007/978-3-319-76207-4_15⟩
Accès au texte intégral et bibtex

titre: Bayesian Inference for Least Squares Temporal Difference Regularization
auteur: Nikolaos Tziortziotis, Christos Dimitrakakis
article: ECML 2017 – European Conference on Machine Learning, Sep 2017, Skopje, Macedonia
Accès au texte intégral et bibtex

titre: LIG-CRIStAL System for the WMT17 Automatic Post-Editing Task
auteur: Alexandre Bérard, Olivier Pietquin, Laurent Besacier
article: Second conference on machine translation (WMT17) during EMNLP 2017, Sep 2017, Copenhagen, Denmark
Accès au texte intégral et bibtex

titre: End-to-end optimization of goal-driven and visually grounded dialogue systems
auteur: Florian Strub, Harm de Vries, Jeremie Mary, Bilal Piot, Aaron Courville, Olivier Pietquin
article: International Joint Conference on Artificial Intelligence, Aug 2017, Melbourne, Australia
Accès au texte intégral et bibtex

titre: Online learning and transfer for user adaptation in dialogue systems
auteur: Nicolas Carrara, Romain Laroche, Olivier Pietquin
article: SIGDIAL/SEMDIAL joint special session on negotiation dialog 2017, Aug 2017, Saarbrücken, Germany
Accès au texte intégral et bibtex

titre: Learning Visual Reasoning Without Strong Priors
auteur: Ethan Perez, Harm de Vries, Florian Strub, Vincent Dumoulin, Aaron Courville
article: ICML 2017’s Machine Learning in Speech and Language Processing Workshop, Aug 2017, Sydney, Australia
Accès au bibtex

titre: Active Learning for Accurate Estimation of Linear Models
auteur: Carlos Riquelme, Mohammad Ghavamzadeh, Alessandro Lazaric
article: ICML 2017 – 34th International Conference on Machine Learning, Aug 2017, Sydney, Australia. pp.36
Accès au texte intégral et bibtex

titre: Boosted Fitted Q-Iteration
auteur: Samuele Tosatto, Matteo Pirotta, Carlo d’Eramo, Marcello Restelli
article: 34th International Conference on Machine Learning (ICML), Aug 2017, Sydney, Australia
Accès au texte intégral et bibtex

titre: GuessWhat?! Visual object discovery through multi-modal dialogue
auteur: Harm de Vries, Florian Strub, Sarath Chandar, Olivier Pietquin, Hugo Larochelle, Aaron Courville
article: Conference on Computer Vision and Pattern Recognition, Jul 2017, Honolulu, United States
Accès au texte intégral et bibtex

titre: A Multi-Armed Bandit Model Selection for Cold-Start User Recommendation
auteur: Crícia Z Felício, Klérisson V R Paixão, Celia A Z Barcelos, Philippe Preux
article: 25th ACM Conference on User Modelling, Adaptation and Personalization (UMAP), Jul 2017, Bratislava, Slovakia
Accès au texte intégral et bibtex

titre: Faut-il minimiser le résidu de Bellman ou maximiser la valeur moyenne ?
auteur: Matthieu Geist, Bilal Piot, Olivier Pietquin
article: Journées Francophones sur la Planification, la Décision et l’Apprentissage pour la conduite de systèmes (JFPDA 2017), Jul 2017, Caen, France
Accès au bibtex

titre: Spectral Learning from a Single Trajectory under Finite-State Policies
auteur: Borja Balle, Odalric-Ambrym Maillard
article: International Conference on Machine Learning, Jul 2017, Sydney, Australia
Accès au texte intégral et bibtex

titre: Thompson Sampling for Linear-Quadratic Control Problems
auteur: Marc Abeille, Alessandro Lazaric
article: AISTATS 2017 – 20th International Conference on Artificial Intelligence and Statistics, Apr 2017, Fort Lauderdale, United States
Accès au texte intégral et bibtex

titre: Learning Nash Equilibrium for General-Sum Markov Games from Batch Data
auteur: Julien Pérolat, Florian Strub, Bilal Piot, Olivier Pietquin
article: AISTATS 2017 – The 20th International Conference on Artificial Intelligence and Statistics, Apr 2017, Fort Lauderdale, United States. pp.1-14
Accès au texte intégral et bibtex

titre: Linear Thompson Sampling Revisited
auteur: Marc Abeille, Alessandro Lazaric
article: AISTATS 2017 – 20th International Conference on Artificial Intelligence and Statistics, Apr 2017, Fort Lauderdale, United States
Accès au texte intégral et bibtex

titre: Exploration–Exploitation in MDPs with Options
auteur: Ronan Fruit, Alessandro Lazaric
article: AISTATS 2017 – 20th International Conference on Artificial Intelligence and Statistics, Apr 2017, Fort Lauderdale, United States
Accès au texte intégral et bibtex

titre: Parallel Higher Order Alternating Least Square for Tensor Recommender System
auteur: Romain Warlop, Alessandro Lazaric, Jérémie Mary
article: AAAI 2017 – Thirty-First AAAI Conference on Artificial Intelligence, Feb 2017, San Francisco, United States
Accès au bibtex

titre: Transfer Reinforcement Learning with Shared Dynamics
auteur: Romain Laroche, Merwan Barlier
article: AAAI-17 – Thirty-First AAAI Conference on Artificial Intelligence, Feb 2017, San Francisco, United States. pp.7
Accès au texte intégral et bibtex

titre: Efficient second-order online kernel learning with adaptive embedding
auteur: Daniele Calandriello, Alessandro Lazaric, Michal Valko
article: Neural Information Processing Systems, 2017, Long Beach, United States
Accès au texte intégral et bibtex

titre: Second-Order Kernel Online Convex Optimization with Adaptive Sketching
auteur: Daniele Calandriello, Alessandro Lazaric, Michal Valko
article: International Conference on Machine Learning, 2017, Sydney, Australia
Accès au texte intégral et bibtex

titre: Zonotope hit-and-run for efficient sampling from projection DPPs
auteur: Guillaume Gautier, Rémi Bardenet, Michal Valko
article: International Conference on Machine Learning, 2017, Sydney, Australia
Accès au texte intégral et bibtex

titre: Distributed adaptive sampling for kernel matrix approximation
auteur: Daniele Calandriello, Alessandro Lazaric, Michal Valko
article: International Conference on Artificial Intelligence and Statistics, 2017, Fort Lauderdale, United States
Accès au texte intégral et bibtex

titre: Trading off rewards and errors in multi-armed bandits
auteur: Akram Erraqabi, Alessandro Lazaric, Michal Valko, Emma Brunskill, Yun-En Liu
article: International Conference on Artificial Intelligence and Statistics, 2017, Fort Lauderdale, United States
Accès au texte intégral et bibtex

Lectures

titre: Basic Concentration Properties of Real-Valued Distributions
auteur: Odalric-Ambrym Maillard
article: Doctoral. France. 2017
Accès au texte intégral et bibtex

Poster communications

titre: Multi-Armed Bandit Learning in IoT Networks
auteur: Rémi Bonnefoi, Lilian Besson
article: Journée des Doctorants de l’IETR, Jul 2017, Rennes, France
Accès au texte intégral et bibtex

Theses

titre: Efficient Sequential Learning in Structured and Constrained Environments
auteur: Daniele Calandriello
article: Machine Learning [cs.LG]. Inria Lille Nord Europe – Laboratoire CRIStAL – Université de Lille, 2017. English
Accès au texte intégral et bibtex

titre: Reinforcement Learning: The Multi-Player Case
auteur: Julien Pérolat
article: Artificial Intelligence [cs.AI]. Université de Lille 1 – Sciences et Technologies, 2017. English
Accès au texte intégral et bibtex

titre: Exploration-Exploitation with Thompson Sampling in Linear Systems
auteur: Marc Abeille
article: Mathematics [math]. Université de Lille 1, 2017. English
Accès au texte intégral et bibtex

titre: Bandits Multi-bras avec retour d’information non-conventionnelle
auteur: Pratik Gajane
article: Artificial Intelligence [cs.AI]. Université de Lille, 2017. English
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: Difference of Convex Functions Programming Applied to Control with Expert Data
auteur: Bilal Piot, Matthieu Geist, Olivier Pietquin
article: 2017
Accès au texte intégral et bibtex

titre: Subjective Fairness
auteur: Christos Dimitrakakis, Yang Liu, David Parkes, Goran Radanovic
article: 2017
Accès au texte intégral et bibtex

titre: Multi-view Sequential Games: The Helper-Agent Problem
auteur: Christos Dimitrakakis, Firas Jarboui, David Parkes, Lior Seeman
article: 2017
Accès au texte intégral et bibtex

2016

Journal articles

titre: Exploiting Social Information in Pairwise Preference Recommender System
auteur: Crícia Z Felício, Klérisson V R Paixão, Guilherme Alves, Sandra de Amo, Philippe Preux
article: Journal of Information and Data Management, 2016, 7 (2), pp.16
Accès au texte intégral et bibtex

titre: Importance Weighting Without Importance Weights: An Efficient Algorithm for Combinatorial Semi-Bandits
auteur: Gergely Neu, Gábor Bartók
article: Journal of Machine Learning Research, 2016, 17 (154), pp.1 – 21
Accès au texte intégral et bibtex

titre: Bayesian Policy Gradient and Actor-Critic Algorithms
auteur: Mohammad Ghavamzadeh, Yaakov Engel, Michal Valko
article: Journal of Machine Learning Research, 2016, 17 (66), pp.1-53
Accès au texte intégral et bibtex

titre: On the Complexity of Best Arm Identification in Multi-Armed Bandit Models
auteur: Emilie Kaufmann, Olivier Cappé, Aurélien Garivier
article: Journal of Machine Learning Research, 2016, 17, pp.1-42
Accès au texte intégral et bibtex

titre: Consistent Algorithms for Clustering Time Series
auteur: Azadeh Khaleghi, Daniil Ryabko, Jérémie Mary, Philippe Preux
article: Journal of Machine Learning Research, 2016, 17 (3), pp.1 – 32
Accès au texte intégral et bibtex

titre: Nonparametric multiple change point estimation in highly dependent time series
auteur: Azadeh Khaleghi, Daniil Ryabko
article: Theoretical Computer Science, 2016, 620, pp.119-133. ⟨10.1016/j.tcs.2015.10.041⟩
Accès au bibtex

titre: Operator-valued Kernels for Learning from Functional Response Data
auteur: Hachem Kadri, Emmanuel Duflos, Philippe Preux, Stéphane Canu, Alain Rakotomamonjy, Julien Audiffren
article: Journal of Machine Learning Research, 2016, 17 (20), pp.1-54
Accès au texte intégral et bibtex

titre: Analysis of Classification-based Policy Iteration Algorithms
auteur: Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos
article: Journal of Machine Learning Research, 2016, 17, pp.1 – 30
Accès au texte intégral et bibtex

Conference papers

titre: Learning Dialogue Dynamics with the Method of Moments
auteur: Merwan Barlier, Romain Laroche, Olivier Pietquin
article: Workshop on Spoken Language Technology (SLT 2016), Dec 2016, San Diego, United States
Accès au texte intégral et bibtex

titre: Word2Vec vs DBnary: Augmenting METEOR using Vector Representations or Lexical Resources?
auteur: Christophe Servan, Alexandre Bérard, Zied Elloumi, Hervé Blanchon, Laurent Besacier
article: COLING 2016, ANLP & ICCL, Dec 2016, Osaka, Japan
Accès au texte intégral et bibtex

titre: Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation
auteur: Alexandre Bérard, Olivier Pietquin, Laurent Besacier, Christophe Servan
article: NIPS Workshop on end-to-end learning for speech and audio processing, Dec 2016, Barcelona, Spain
Accès au texte intégral et bibtex

titre: Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning
auteur: Jean-Bastien Grill, Michal Valko, Rémi Munos
article: Neural Information Processing Systems, Dec 2016, Barcelona, Spain
Accès au texte intégral et bibtex

titre: On Explore-Then-Commit Strategies
auteur: Aurélien Garivier, Emilie Kaufmann, Tor Lattimore
article: NIPS, Dec 2016, Barcelona, Spain
Accès au texte intégral et bibtex

titre: Preference-like Score to Cope with Cold-Start User in Recommender Systems
auteur: Crícia Z Felício, Klérisson V R Paixão, Celia A Z Barcelos, Philippe Preux
article: 28th International Conference on Tools with Artificial Intelligence (ICTAI), Nov 2016, San Jose, United States
Accès au texte intégral et bibtex

titre: Sequential Collaborative Ranking Using (No-)Click Implicit Feedback
auteur: Frédéric Guillou, Romaric Gaudel, Philippe Preux
article: The 23rd International Conference on Neural Information Processing (ICONIP’16), Oct 2016, Kyoto, Japan. pp.288 – 296, ⟨10.1007/978-3-319-46672-9_33⟩
Accès au texte intégral et bibtex

titre: Mutation-Based Graph Inference for Fault Localization
auteur: Vincenzo Musco, Martin Monperrus, Philippe Preux
article: International Working Conference on Source Code Analysis and Manipulation, Oct 2016, Raleigh, United States. ⟨10.1109/SCAM.2016.24⟩
Accès au texte intégral et bibtex

titre: Things Bayes can’t do
auteur: Daniil Ryabko
article: Proceedings of the 27th International Conference on Algorithmic Learning Theory (ALT’16), Oct 2016, Bari, Italy. pp.253-260, ⟨10.1007/978-3-319-46379-7_17⟩
Accès au bibtex

titre: Hybrid Recommender System based on Autoencoders
auteur: Florian Strub, Romaric Gaudel, Jérémie Mary
article: the 1st Workshop on Deep Learning for Recommender Systems, Sep 2016, Boston, United States. pp.11 – 16, ⟨10.1145/2988450.2988456⟩
Accès au texte intégral et bibtex

titre: Large-scale Bandit Recommender System
auteur: Frédéric Guillou, Romaric Gaudel, Philippe Preux
article: Proc. of the Second International Workshop on Machine Learning, Optimization and Big Data (MOD), Sep 2016, Volterra, Italy. pp.11, ⟨10.1007/978-3-319-51469-7_17⟩
Accès au texte intégral et bibtex

titre: A Stochastic Model for Computer-Aided Human-Human Dialogue
auteur: Merwan Barlier, Romain Laroche, Olivier Pietquin
article: Interspeech 2016, Sep 2016, San Francisco, United States. pp.2051 – 2055
Accès au texte intégral et bibtex

titre: Filtrage Collaboratif Hybride avec des Auto-encodeurs
auteur: Florian Strub, Jérémie Mary, Romaric Gaudel
article: Conférence francophone sur l’Apprentissage Automatique (CAp’16), Jul 2016, Marseille, France
Accès au bibtex

titre: Compromis exploration-exploitation pour système de recommandation à grande échelle
auteur: Frédéric Guillou, Romaric Gaudel, Philippe Preux
article: Conférence francophone sur l’Apprentissage Automatique (CAp’16), Jul 2016, Marseille, France
Accès au bibtex

titre: Online learning with Erdős-Rényi side-observation graphs
auteur: Tomáš Kocák, Gergely Neu, Michal Valko
article: Uncertainty in Artificial Intelligence, Jun 2016, New York City, United States
Accès au texte intégral et bibtex

titre: Analysis of Nyström method with sequential ridge leverage score sampling
auteur: Daniele Calandriello, Alessandro Lazaric, Michal Valko
article: Uncertainty in Artificial Intelligence, Jun 2016, New York City, United States
Accès au texte intégral et bibtex

titre: Maximin Action Identification: A New Bandit Framework for Games
auteur: Aurélien Garivier, Emilie Kaufmann, Wouter M. Koolen
article: 29th Annual Conference on Learning Theory (COLT), Jun 2016, New York, United States
Accès au texte intégral et bibtex

titre: Optimal Best Arm Identification with Fixed Confidence
auteur: Aurélien Garivier, Emilie Kaufmann
article: 29th Annual Conference on Learning Theory (COLT), Jun 2016, New York, United States
Accès au texte intégral et bibtex

titre: Softened approximate policy iteration for Markov games
auteur: Julien Pérolat, Bilal Piot, Matthieu Geist, Bruno Scherrer, Olivier Pietquin
article: ICML 2016 – 33rd International Conference on Machine Learning, Jun 2016, New York City, United States
Accès au texte intégral et bibtex

titre: PAC learning of Probabilistic Automaton based on the Method of Moments
auteur: Hadrien Glaude, Olivier Pietquin
article: International Conference on Machine Learning (ICML 2016), Jun 2016, New York, United States
Accès au texte intégral et bibtex

titre: Pliable rejection sampling
auteur: Akram Erraqabi, Michal Valko, Alexandra Carpentier, Odalric-Ambrym Maillard
article: International Conference on Machine Learning, Jun 2016, New York City, United States
Accès au texte intégral et bibtex

titre: Reinforcement Learning of POMDPs using Spectral Methods
auteur: Kamyar Azizzadenesheli, Alessandro Lazaric, Animashree Anandkumar
article: Proceedings of the 29th Annual Conference on Learning Theory (COLT2016), Jun 2016, New York City, United States
Accès au texte intégral et bibtex

titre: MultiVec: a Multilingual and Multilevel Representation Learning Toolkit for NLP
auteur: Alexandre Bérard, Christophe Servan, Olivier Pietquin, Laurent Besacier
article: The 10th edition of the Language Resources and Evaluation Conference (LREC), May 2016, Portorož, Slovenia
Accès au texte intégral et bibtex

titre: A Learning Algorithm for Change Impact Prediction
auteur: Vincenzo Musco, Antonin Carette, Martin Monperrus, Philippe Preux
article: 5th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering, May 2016, Austin, United States. pp.8-14, ⟨10.1145/2896995.2896996⟩
Accès au texte intégral et bibtex

titre: Revealing graph bandits for maximizing local influence
auteur: Alexandra Carpentier, Michal Valko
article: International Conference on Artificial Intelligence and Statistics, May 2016, Cadiz, Spain
Accès au texte intégral et bibtex

titre: Online learning with noisy side observations
auteur: Tomáš Kocák, Gergely Neu, Michal Valko
article: International Conference on Artificial Intelligence and Statistics, May 2016, Cadiz, Spain
Accès au texte intégral et bibtex

titre: Score-based Inverse Reinforcement Learning
auteur: Layla El Asri, Bilal Piot, Matthieu Geist, Romain Laroche, Olivier Pietquin
article: International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2016), May 2016, Singapore, Singapore
Accès au texte intégral et bibtex

titre: On the Use of Non-Stationary Strategies for Solving Two-Player Zero-Sum Markov Games
auteur: Julien Pérolat, Bilal Piot, Bruno Scherrer, Olivier Pietquin
article: 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016), May 2016, Cadiz, Spain
Accès au bibtex

titre: Improved Learning Complexity in Combinatorial Pure Exploration Bandits
auteur: Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, Ronald Ortner, Peter Bartlett
article: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), May 2016, Cadiz, Spain
Accès au texte intégral et bibtex

titre: Algorithms for Differentially Private Multi-Armed Bandits
auteur: Aristide C. Y. Tossou, Christos Dimitrakakis
article: AAAI 2016, Feb 2016, Phoenix, Arizona, United States
Accès au texte intégral et bibtex

titre: On the Differential Privacy of Bayesian Inference
auteur: Zuhe Zhang, Benjamin Rubinstein, Christos Dimitrakakis
article: AAAI 2016 – Thirtieth AAAI Conference on Artificial Intelligence, Feb 2016, Phoenix, Arizona, United States
Accès au texte intégral et bibtex

titre: Compact and Interpretable Dialogue State Representation with Genetic Sparse Distributed Memory
auteur: Layla El Asri, Romain Laroche, Olivier Pietquin
article: 7th International Workshop on Spoken Dialogue Systems (IWSDS 2016), Jan 2016, Saariselkä, Finland
Accès au bibtex

titre: Pack only the essentials: Adaptive dictionary learning for kernel ridge regression
auteur: Daniele Calandriello, Alessandro Lazaric, Michal Valko
article: Adaptive and Scalable Nonparametric Methods in Machine Learning at Neural Information Processing Systems, 2016, Barcelona, Spain
Accès au texte intégral et bibtex

titre: Scalable explore-exploit Collaborative Filtering
auteur: Frédéric Guillou, Romaric Gaudel, Philippe Preux
article: Pacific Asia Conference on Information Systems (PACIS’16), 2016, Chiayi, Taiwan
Accès au bibtex

titre: Rewards and errors in multi-arm bandits for interactive education
auteur: Akram Erraqabi, Alessandro Lazaric, Michal Valko, Emma Brunskill, Yun-En Liu
article: Challenges in Machine Learning: Gaming and Education workshop at Neural Information Processing Systems, 2016, Barcelona, Spain
Accès au texte intégral et bibtex

Habilitation à diriger des recherches

titre: Bandits on graphs and structures
auteur: Michal Valko
article: Machine Learning [stat.ML]. École normale supérieure de Cachan – ENS Cachan, 2016
Accès au texte intégral et bibtex

Theses

titre: On Recommendation Systems in a Sequential Context
auteur: Frédéric Guillou
article: Machine Learning [cs.LG]. Université Lille 3, 2016. English
Accès au texte intégral et bibtex

titre: Sequential Learning with Similarities
auteur: Tomáš Kocák
article: Machine Learning [cs.LG]. Inria Lille Nord Europe – Laboratoire CRIStAL – Université de Lille, 2016. English
Accès au texte intégral et bibtex

titre: Learning rational linear sequential systems using the method of moments
auteur: Hadrien Glaude
article: Machine Learning [cs.LG]. Université de Lille 1 – Sciences et Technologies, 2016. French
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: Hybrid Collaborative Filtering with Autoencoders
auteur: Florian Strub, Jérémie Mary, Romaric Gaudel
article: 2016
Accès au texte intégral et bibtex

2015

Journal articles

titre: Truthful Learning Mechanisms for Multi–Slot Sponsored Search Auctions with Externalities
auteur: Nicola Gatti, Alessandro Lazaric, Marco Rocco, Francesco Trovò
article: Artificial Intelligence, 2015, 227, pp.93-139. ⟨10.1016/j.artint.2015.05.012⟩
Accès au texte intégral et bibtex

titre: Optimism in Active Learning
auteur: Timothé Collet, Olivier Pietquin
article: Computational Intelligence and Neuroscience, 2015, 2015, pp.1–17. ⟨10.1155/2015/973696⟩
Accès au bibtex

titre: Random-Walk Perturbations for Online Combinatorial Optimization
auteur: Luc Devroye, Gábor Lugosi, Gergely Neu
article: IEEE Transactions on Information Theory, 2015, 61 (7), pp.4099 – 4106. ⟨10.1109/TIT.2015.2428253⟩
Accès au texte intégral et bibtex

titre: Generalizing the Wilcoxon rank-sum test for interval data
auteur: Julien Pérolat, Ines Couso, Kevin Loquin, Olivier Strauss
article: International Journal of Approximate Reasoning, 2015, 56, pp.108-121. ⟨10.1016/j.ijar.2014.08.001⟩
Accès au bibtex

titre: Approximate modified policy iteration and its application to the game of Tetris
auteur: Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, Boris Lesner, Matthieu Geist
article: Journal of Machine Learning Research, 2015, 16, pp.1629−1676
Accès au texte intégral et bibtex

Conference papers

titre: Spectral learning with proper probabilities for finite state automaton
auteur: Hadrien Glaude, Cyrille Enderli, Olivier Pietquin
article: ASRU 2015 – Automatic Speech Recognition and Understanding Workshop, Dec 2015, Scottsdale, United States
Accès au texte intégral et bibtex

titre: Explore no more: Improved high-probability regret bounds for non-stochastic bandits
auteur: Gergely Neu
article: Advances in Neural Information Processing Systems 28 (NIPS 2015), Dec 2015, Montreal, Canada. pp.3150-3158
Accès au texte intégral et bibtex

titre: Collaborative Filtering as a Multi-Armed Bandit
auteur: Frédéric Guillou, Romaric Gaudel, Philippe Preux
article: NIPS’15 Workshop: Machine Learning for eCommerce, Dec 2015, Montréal, Canada
Accès au texte intégral et bibtex

titre: Bayesian Credible Intervals for Online and Active Learning of Classification Trees
auteur: Timothé Collet, Olivier Pietquin
article: ADPRL 2015 – Symposium on Adaptive Dynamic Programming and Reinforcement Learning, Dec 2015, Cape Town, South Africa
Accès au texte intégral et bibtex

titre: Collaborative Filtering with Stacked Denoising AutoEncoders and Sparse Inputs
auteur: Florian Strub, Jérémie Mary, Philippe Preux
article: NIPS Workshop on Machine Learning for eCommerce, Dec 2015, Montreal, Canada
Accès au texte intégral et bibtex

titre: Non-negative Spectral Learning for Linear Sequential Systems
auteur: Hadrien Glaude, Cyrille Enderli, Olivier Pietquin
article: 22nd International Conference on Neural Information Processing (ICONIP2015), Nov 2015, Istanbul, Turkey
Accès au texte intégral et bibtex

titre: Optimism in Active Learning with Gaussian Processes
auteur: Timothé Collet, Olivier Pietquin
article: 22nd International Conference on Neural Information Processing (ICONIP2015), Nov 2015, Istanbul, Turkey
Accès au texte intégral et bibtex

titre: Learning of scanning strategies for electronic support using predictive state representations
auteur: Hadrien Glaude, Cyrille Enderli, Jean-François Grandin, Olivier Pietquin
article: International Workshop on Machine Learning for Signal Processing (MLSP 2015), Sep 2015, Boston, United States
Accès au texte intégral et bibtex

titre: Human-Machine Dialogue as a Stochastic Game
auteur: Merwan Barlier, Julien Pérolat, Romain Laroche, Olivier Pietquin
article: 16th Annual SIGdial Meeting on Discourse and Dialogue (SIGDIAL 2015), Sep 2015, Prague, Czech Republic
Accès au texte intégral et bibtex

titre: Inverse Reinforcement Learning in Relational Domains
auteur: Thibaut Munzer, Bilal Piot, Matthieu Geist, Olivier Pietquin, Manuel Lopes
article: International Joint Conferences on Artificial Intelligence, Jul 2015, Buenos Aires, Argentina
Accès au texte intégral et bibtex

titre: Direct Policy Iteration with Demonstrations
auteur: Jessica Chemali, Alessandro Lazaric
article: IJCAI – 24th International Joint Conference on Artificial Intelligence, Jul 2015, Buenos Aires, Argentina
Accès au texte intégral et bibtex

titre: Maximum Entropy Semi-Supervised Inverse Reinforcement Learning
auteur: Julien Audiffren, Michal Valko, Alessandro Lazaric, Mohammad Ghavamzadeh
article: International Joint Conference on Artificial Intelligence, Jul 2015, Buenos Aires, Argentina
Accès au texte intégral et bibtex

titre: Bandits and Recommender Systems
auteur: Jérémie Mary, Romaric Gaudel, Philippe Preux
article: First International Workshop on Machine Learning, Optimization, and Big Data (MOD’15), Jul 2015, Taormina, Italy. pp.325-336, ⟨10.1007/978-3-319-27926-8_29⟩
Accès au texte intégral et bibtex

titre: Large-scale semi-supervised learning with online spectral graph sparsification
auteur: Daniele Calandriello, Alessandro Lazaric, Michal Valko
article: Resource-Efficient Machine Learning workshop at International Conference on Machine Learning, Jul 2015, Lille, France
Accès au texte intégral et bibtex

titre: Imitation Learning Applied to Embodied Conversational Agents
auteur: Bilal Piot, Matthieu Geist, Olivier Pietquin
article: 4th Workshop on Machine Learning for Interactive Systems (MLIS 2015), Jul 2015, Lille, France
Accès au texte intégral et bibtex

titre: A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits
auteur: Pratik Gajane, Tanguy Urvoy, Fabrice Clérot
article: Proceedings of the 32nd International Conference on Machine Learning, Jul 2015, Lille, France. pp.218-227
Accès au texte intégral et bibtex

titre: Qualitative Multi-Armed Bandits: A Quantile-Based Approach
auteur: Balázs Szörényi, Róbert Busa-Fekete, Paul Weng, Eyke Hüllermeier
article: 32nd International Conference on Machine Learning, Jul 2015, Lille, France. pp.1660-1668
Accès au texte intégral et bibtex

titre: Approximate dynamic programming for two-player zero-sum Markov games
auteur: Julien Pérolat, Bruno Scherrer, Bilal Piot, Olivier Pietquin
article: International Conference on Machine Learning (ICML 2015), Jul 2015, Lille, France
Accès au texte intégral et bibtex

titre: First-order regret bounds for combinatorial semi-bandits
auteur: Gergely Neu
article: Proceedings of the 28th Annual Conference on Learning Theory (COLT), Jul 2015, Paris, France. pp.1360-1375
Accès au texte intégral et bibtex

titre: Improved Regret Bounds for Undiscounted Continuous Reinforcement Learning
auteur: Kailasam Lakshmanan, Ronald Ortner, Daniil Ryabko
article: International Conference on Machine Learning (ICML), Jul 2015, Lille, France
Accès au bibtex

titre: Simple regret for infinitely many armed bandits
auteur: Alexandra Carpentier, Michal Valko
article: International Conference on Machine Learning, Jul 2015, Lille, France
Accès au texte intégral et bibtex

titre: The Replacement Bootstrap for Dependent Data
auteur: Amir Sani, Alessandro Lazaric, Daniil Ryabko
article: Proceedings of the IEEE International Symposium on Information Theory, Jun 2015, Hong Kong, Hong Kong SAR China
Accès au texte intégral et bibtex

titre: Prédiction de performance sur des questions dichotomiques: comparaison de modèles pour des tests adaptatifs à grande échelle
auteur: Jill-Jênn Vie, Fabrice Popineau, Jean-Bastien Grill, Eric Bruillard, Yolaine Bourda
article: Atelier Évaluation des Apprentissages et Environnements Informatiques, EIAH 2015, Jun 2015, Agadir, Morocco
Accès au bibtex

titre: Predicting the outcomes of every process for which an asymptotically accurate stationary predictor exists is impossible
auteur: Daniil Ryabko, Boris Ryabko
article: International Symposium on Information Theory, Jun 2015, Hong Kong, Hong Kong SAR China. pp.1204-1206
Accès au texte intégral et bibtex

titre: Simultaneous Optimistic Optimization on the Noiseless BBOB Testbed
auteur: Bilel Derbel, Philippe Preux
article: The 17th IEEE Congress on Evolutionary Computation (CEC), May 2015, Sendai, Japan
Accès au bibtex

titre: An Experimental Protocol for Analyzing the Accuracy of Software Error Impact Analysis
auteur: Vincenzo Musco, Martin Monperrus, Philippe Preux
article: Tenth IEEE/ACM International Workshop on Automation of Software Test, May 2015, Florence, Italy. ⟨10.1109/AST.2015.20⟩
Accès au texte intégral et bibtex

titre: Cheap Bandits
auteur: Manjesh Kumar Hanawal, Venkatesh Saligrama, Michal Valko, Rémi Munos
article: International Conference on Machine Learning, 2015, Lille, France
Accès au texte intégral et bibtex

titre: Black-box optimization of noisy functions with unknown smoothness
auteur: Jean-Bastien Grill, Michal Valko, Rémi Munos
article: Neural Information Processing Systems, 2015, Montréal, Canada
Accès au texte intégral et bibtex

Habilitation à diriger des recherches

titre: Data-Driven Recommender Systems
auteur: Jérémie Mary
article: Artificial Intelligence [cs.AI]. Université de Lille 3, 2015
Accès au texte intégral et bibtex

Other publications

titre: L’apprentissage automatique : le diable n’est pas dans l’algorithme
auteur: Philippe Preux, Marc Tommasi, Thierry Viéville, Colin de La Higuera
article: 2015
Accès au bibtex

Books

titre: Proceedings of The 4th Workshop on Machine Learning for Interactive Systems (MLIS2015)
auteur: Heriberto Cuayáhuitl, Nina Dethlefs, Lutz Frommberger, Martijn van Otterlo, Olivier Pietquin
article: JMLR Workshop and Conference Proceedings, 43, 2015
Accès au bibtex

Poster communications

titre: Predicting Performance over Dichotomous Questions: Comparing Models for Large-Scale Adaptive Testing
auteur: Jill-Jênn Vie, Fabrice Popineau, Jean-Bastien Grill, Eric Bruillard, Yolaine Bourda
article: 8th International Conference on Educational Data Mining (EDM 2015), Jun 2015, Madrid, Spain
Accès au bibtex

Proceedings

titre: Collaborative Filtering with Localised Ranking
auteur: Charanpal Dhanjal, Romaric Gaudel, Stéphan Clémençon
article: pp.7, 2015, Proceedings of
Accès au bibtex

Theses

titre: Sequential Resource Allocation in Linear Stochastic Bandits
auteur: Marta Soare
article: Machine Learning [cs.LG]. Université Lille 1 – Sciences et Technologies, 2015. English.
Accès au texte intégral et bibtex

titre: Machine Learning for Decision Making
auteur: Amir Sani
article: Machine Learning [stat.ML]. Université de Lille 1, 2015. English.
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: A Learning Algorithm for Change Impact Prediction: Experimentation on 7 Java Applications
auteur: Vincenzo Musco, Antonin Carette, Martin Monperrus, Philippe Preux
article: 2015
Accès au bibtex

titre: AUC Optimisation and Collaborative Filtering
auteur: Charanpal Dhanjal, Romaric Gaudel, Stéphan Clémençon
article: 2015
Accès au texte intégral et bibtex

2014

Journal articles

titre: Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm
auteur: Róbert Busa-Fekete, Balázs Szörényi, Paul Weng, Weiwei Cheng, Eyke Hüllermeier
article: Machine Learning, 2014, 97 (3), pp.327-351. ⟨10.1007/s10994-014-5458-8⟩
Accès au texte intégral et bibtex

titre: Efficient Eigen-updating for Spectral Graph Clustering
auteur: Charanpal Dhanjal, Romaric Gaudel, Stéphan Clémençon
article: Neurocomputing, 2014, 131, pp.440-452. ⟨10.1016/j.neucom.2013.11.015⟩
Accès au texte intégral et bibtex

titre: An experimental comparison of four magnetocaloric regenerators using three different materials
auteur: Ulrich Legait, Frédéric Guillou, Afef Kedous-Lebouc, Vincent Hardy, Morgan Almanza
article: International Journal of Refrigeration, 2014, 37, pp.147-155. ⟨10.1016/j.ijrefrig.2013.07.006⟩
Accès au bibtex

titre: Online Markov Decision Processes Under Bandit Feedback
auteur: Gergely Neu, András György, Csaba Szepesvári, András Antos
article: IEEE Transactions on Automatic Control, 2014, 59, pp.676 – 691. ⟨10.1109/TAC.2013.2292137⟩
Accès au texte intégral et bibtex

titre: Near-Optimal Rates for Limited-Delay Universal Lossy Source Coding
auteur: András György, Gergely Neu
article: IEEE Transactions on Information Theory, 2014, pp.2823-2834. ⟨10.1109/TIT.2014.2307062⟩
Accès au texte intégral et bibtex

titre: Regret bounds for restless Markov bandits
auteur: Ronald Ortner, Daniil Ryabko, Peter Auer, Rémi Munos
article: Theoretical Computer Science, 2014, 558, pp.62-76. ⟨10.1016/j.tcs.2014.09.026⟩
Accès au bibtex

titre: Uniform hypothesis testing for finite-valued stationary processes
auteur: Daniil Ryabko
article: Statistics, 2014, 48 (1), pp.121-128. ⟨10.1080/02331888.2012.719511⟩
Accès au texte intégral et bibtex

Conference papers

titre: Best-Arm Identification in Linear Bandits
auteur: Marta Soare, Alessandro Lazaric, Rémi Munos
article: NIPS – Advances in Neural Information Processing Systems 27, Dec 2014, Montreal, Canada
Accès au texte intégral et bibtex

titre: Optimistic planning in Markov decision processes using a generative model
auteur: Balázs Szörényi, Gunnar Kedenburg, Rémi Munos
article: Advances in Neural Information Processing Systems 27, Dec 2014, Montréal, Canada
Accès au texte intégral et bibtex

titre: Exploiting easy data in online optimization
auteur: Amir Sani, Gergely Neu, Alessandro Lazaric
article: Advances in Neural Information Processing Systems 27, Dec 2014, Montreal, Canada
Accès au texte intégral et bibtex

titre: Online combinatorial optimization with stochastic decision sets and adversarial losses
auteur: Gergely Neu, Michal Valko
article: Neural Information Processing Systems, Dec 2014, Montréal, Canada
Accès au texte intégral et bibtex

titre: Efficient learning by implicit exploration in bandit problems with side observations
auteur: Tomáš Kocák, Gergely Neu, Michal Valko, Rémi Munos
article: Neural Information Processing Systems, Dec 2014, Montréal, Canada
Accès au texte intégral et bibtex

titre: Extreme bandits
auteur: Alexandra Carpentier, Michal Valko
article: Neural Information Processing Systems, Dec 2014, Montréal, Canada
Accès au texte intégral et bibtex

titre: Sparse Multi-task Reinforcement Learning
auteur: Daniele Calandriello, Alessandro Lazaric, Marcello Restelli
article: NIPS – Advances in Neural Information Processing Systems 27, Dec 2014, Montreal, Canada
Accès au texte intégral et bibtex

titre: Subspace Identification for Predictive State Representation by Nuclear Norm Minimization
auteur: Hadrien Glaude, Cyrille Enderli, Olivier Pietquin
article: IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2014), Dec 2014, Orlando, United States
Accès au bibtex

titre: Difference of Convex Functions Programming for Reinforcement Learning
auteur: Bilal Piot, Matthieu Geist, Olivier Pietquin
article: Advances in Neural Information Processing Systems (NIPS 2014), Dec 2014, Montreal, Canada
Accès au texte intégral et bibtex

titre: Selecting Near-Optimal Approximate State Representations in Reinforcement Learning
auteur: Ronald Ortner, Odalric-Ambrym Maillard, Daniil Ryabko
article: International Conference on Algorithmic Learning Theory (ALT), Oct 2014, Bled, Slovenia. pp.140-154
Accès au bibtex

titre: CoAdapt P300 speller: optimized flashing sequences and online learning
auteur: Eoin Thomas, Emmanuel Daucé, Dieter Devlaminck, Loïc Mahé, Alexandra Carpentier, Rémi Munos, Margaux Perrin, Emmanuel Maby, Jérémie Mattout, Théodore Papadopoulo, Maureen Clerc
article: 6th International Brain Computer Interface Conference, Sep 2014, Graz, Austria
Accès au texte intégral et bibtex

titre: Predicting when to laugh with structured classification
auteur: Bilal Piot, Olivier Pietquin, Matthieu Geist
article: InterSpeech 2014, Sep 2014, Singapore, Singapore. pp.1786-1790
Accès au texte intégral et bibtex

titre: A diffusion strategy for distributed dictionary learning
auteur: Pierre Chainais, Cédric Richard
article: 2nd “international Traveling Workshop on Interactions between Sparse models and Technology” (iTWIST’14), Laurent Jacques, Aug 2014, Namur, Belgium
Accès au texte intégral et bibtex

titre: Biclique Coverings, Rectifier Networks and the Cost of ε-Removal
auteur: Szabolcs Iván, Ádám Lelkes, Judit Nagy-György, Balázs Szörényi, György Turán
article: 16th International Workshop on Descriptional Complexity of Formal Systems, Proceedings, Aug 2014, Turku, Finland. pp.174 – 185, ⟨10.1007/978-3-319-09704-6_16⟩
Accès au texte intégral et bibtex

titre: Spectral Bandits for Smooth Graph Functions with Applications in Recommender Systems
auteur: Tomáš Kocák, Michal Valko, Rémi Munos, Branislav Kveton, Shipra Agrawal
article: AAAI Workshop on Sequential Decision-Making with Big Data, Jul 2014, Québec City, Canada
Accès au texte intégral et bibtex

titre: Spectral Thompson Sampling
auteur: Tomáš Kocák, Michal Valko, Rémi Munos, Shipra Agrawal
article: AAAI Conference on Artificial Intelligence, Jul 2014, Québec City, Canada
Accès au texte intégral et bibtex

titre: PAC Rank Elicitation through Adaptive Sampling of Stochastic Pairwise Preferences
auteur: Róbert Busa-Fekete, Balázs Szörényi, Eyke Hüllermeier
article: 28th AAAI Conference on Artificial Intelligence (AAAI-14), Jul 2014, Quebec City, Canada
Accès au texte intégral et bibtex

titre: Bandits attack function optimization
auteur: Philippe Preux, Rémi Munos, Michal Valko
article: IEEE Congress on Evolutionary Computation, Jul 2014, Beijing, China
Accès au texte intégral et bibtex

titre: Benthic sensitive habitat mapping: a new tool for ecosystem management?
auteur: Aurélie Foveau, Sandrine Vaz, Nicolas Desroy, Vladimir E. Kostylev, Jean-Claude Dauvin, André Carpentier
article: Forum TransChannel ‘Science and Governance of the Channel Marine Ecosystem’, Jul 2014, Caen, France
Accès au bibtex

titre: Improving offline evaluation of contextual bandit algorithms via bootstrapping techniques
auteur: Olivier Nicol, Jérémie Mary, Philippe Preux
article: International Conference on Machine Learning, Jun 2014, Beijing, China
Accès au texte intégral et bibtex

titre: Preference-Based Rank Elicitation using Statistical Models: The Case of Mallows
auteur: Róbert Busa-Fekete, Eyke Hüllermeier, Balázs Szörényi
article: Proceedings of The 31st International Conference on Machine Learning, Jun 2014, Beijing, China
Accès au texte intégral et bibtex

titre: Online Stochastic Optimization under Correlated Bandit Feedback
auteur: Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill
article: 31st International Conference on Machine Learning, Jun 2014, Beijing, China
Accès au texte intégral et bibtex

titre: Asymptotically consistent estimation of the number of change points in highly dependent time series
auteur: Azadeh Khaleghi, Daniil Ryabko
article: International Conference on Machine Learning (ICML), Jun 2014, Beijing, China. pp.539-547
Accès au bibtex

titre: Spectral Bandits for Smooth Graph Functions
auteur: Michal Valko, Rémi Munos, Branislav Kveton, Tomáš Kocák
article: International Conference on Machine Learning, May 2014, Beijing, China
Accès au texte intégral et bibtex

titre: Méthode de minimisation du résidu de Bellman boostée qui tient compte des démonstrations expertes.
auteur: Bilal Piot, Matthieu Geist, Olivier Pietquin
article: 9èmes Journées Francophones de Planification, Décision et Apprentissage (JFPDA’14), May 2014, Liège, Belgium
Accès au bibtex

titre: Quantitative control of the error bounds of a fast super-resolution technique for microscopy and astronomy
auteur: Pierre Chainais, Pierre Pfennig, Aymeric Leray
article: Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), May 2014, Florence, Italy. pp.2853 – 2857, ⟨10.1109/ICASSP.2014.6854121⟩
Accès au texte intégral et bibtex

titre: Online Matrix Completion Through Nuclear Norm Regularisation
auteur: Charanpal Dhanjal, Romaric Gaudel, Stéphan Clémençon
article: SDM – SIAM International Conference on Data Mining, Apr 2014, Philadelphia, United States. ⟨10.1137/1.9781611973440.72⟩
Accès au texte intégral et bibtex

titre: Evidence build-up facilitates on-line adaptivity in dynamic environments: example of the BCI P300-speller
auteur: Emmanuel Daucé, Eoin Thomas
article: 22nd European Symposium on Artificial Neural Networks, Apr 2014, Bruges, Belgium
Accès au texte intégral et bibtex

titre: Synthèse en espace et temps du rayonnement acoustique d’une paroi sous excitation turbulente par synthèse spectrale 2D+T et formulation vibro-acoustique directe
auteur: Marc Pachebat, Nicolas Totaro, Pierre Chainais, Olivier Collery
article: Congrès Français d’acoustique 2014, Apr 2014, Poitiers, France. 6 p., p1921, paper N183
Accès au texte intégral et bibtex

titre: MESSI: Maximum Entropy Semi-Supervised Inverse Reinforcement Learning
auteur: Julien Audiffren, Michal Valko, Alessandro Lazaric, Mohammad Ghavamzadeh
article: NIPS Workshop on Novel Trends and Applications in Reinforcement Learning, 2014, Montreal, Canada
Accès au texte intégral et bibtex

Other publications

titre: User Engagement as Evaluation: a Ranking or a Regression Problem?
auteur: Frédéric Guillou, Romaric Gaudel, Jérémie Mary, Philippe Preux
article: 2014, ⟨10.1145/2668067.2668073⟩
Accès au texte intégral et bibtex

Reports

titre: Bandits Warm-up Cold Recommender Systems
auteur: Jérémie Mary, Romaric Gaudel, Philippe Preux
article: [Research Report] RR-8563, INRIA Lille; INRIA. 2014, pp.18
Accès au texte intégral et bibtex

titre: From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning
auteur: Rémi Munos
article: 2014
Accès au texte intégral et bibtex

titre: A Generative Model of Software Dependency Graphs to Better Understand Software Evolution
auteur: Vincenzo Musco, Martin Monperrus, Philippe Preux
article: [Technical Report] hal-01078716, Inria. 2014
Accès au bibtex

Theses

titre: Data-driven evaluation of Contextual Bandit algorithms and applications to Dynamic Recommendation
auteur: Olivier Nicol
article: Machine Learning [stat.ML]. Université de Lille I, 2014. English.
Accès au texte intégral et bibtex

titre: Mining Software Engineering Data for Useful Knowledge
auteur: Boris Baldassari
article: Machine Learning [stat.ML]. Université de Lille, 2014. English.
Accès au texte intégral et bibtex

titre: Budgeted Classification-based Policy Iteration
auteur: Victor Gabillon
article: Machine Learning [stat.ML]. Université Lille 1, 2014. English.
Accès au texte intégral et bibtex

2013

Journal articles

titre: Kullback-Leibler Upper Confidence Bounds for Optimal Sequential Allocation
auteur: Olivier Cappé, Aurélien Garivier, Odalric-Ambrym Maillard, Rémi Munos, Gilles Stoltz
article: Annals of Statistics, 2013, 41 (3), pp.1516-1541. ⟨10.1214/13-AOS1119⟩
Accès au texte intégral et bibtex

titre: Outlier detection for patient monitoring and alerting
auteur: Milos Hauskrecht, Iyad Batal, Michal Valko, Shyam Visweswaran, Gregory F Cooper, Gilles Clermont
article: Journal of Biomedical Informatics, 2013, 46, pp.47-55. ⟨10.1016/j.jbi.2012.08.004⟩
Accès au bibtex

titre: Automatic motor task selection via a bandit algorithm for a brain-controlled button
auteur: Joan Fruitet, Alexandra Carpentier, Rémi Munos, Maureen Clerc
article: Journal of Neural Engineering, 2013, 10 (1), ⟨10.1088/1741-2560/10/1/016012⟩
Accès au bibtex

titre: A confidence-set approach to signal denoising
auteur: Boris Ryabko, Daniil Ryabko
article: Statistical Methodology, 2013, 15, pp.115–120. ⟨10.1016/j.stamet.2013.05.003⟩
Accès au bibtex

titre: A Binary-Classification-Based Metric between Time-Series Distributions and Its Use in Statistical and Learning Problems
auteur: Daniil Ryabko, Jérémie Mary
article: Journal of Machine Learning Research, 2013, 14, pp.2837-2856
Accès au bibtex

titre: Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model
auteur: Mohammad Gheshlaghi Azar, Rémi Munos, Hilbert Kappen
article: Machine Learning, 2013, 91 (3), pp.325-349. ⟨10.1007/s10994-013-5368-1⟩
Accès au texte intégral et bibtex

Conference papers

titre: Learning a common dictionary over a sensor network
auteur: Pierre Chainais, Cédric Richard
article: CAMSAP 2013, Dec 2013, Saint-Martin, France. pp.1-4
Accès au texte intégral et bibtex

titre: Thompson Sampling for one-dimensional exponential family bandits
auteur: Nathaniel Korda, Emilie Kaufmann, Rémi Munos
article: NIPS 2013 – Neural Information Processing Systems Conference, Dec 2013, Lake Tahoe, United States
Accès au bibtex

titre: Approximate Dynamic Programming Finally Performs Well in the Game of Tetris
auteur: Victor Gabillon, Mohammad Ghavamzadeh, Bruno Scherrer
article: Neural Information Processing Systems (NIPS) 2013, Dec 2013, South Lake Tahoe, United States
Accès au texte intégral et bibtex

titre: Online Learning in Episodic Markovian Decision Processes by Relative Entropy Policy Search
auteur: Alexander Zimin, Gergely Neu
article: Neural Information Processing Systems 26, Dec 2013, Lake Tahoe, United States
Accès au texte intégral et bibtex

titre: Sequential Transfer in Multi-armed Bandit with Finite Set of Models
auteur: Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill
article: NIPS – Advances in Neural Information Processing Systems 26 – 2013, Dec 2013, Lake Tahoe, United States
Accès au texte intégral et bibtex

titre: Optimizing P300-speller sequences by RIP-ping groups apart
auteur: Eoin M. Thomas, Maureen Clerc, Alexandra Carpentier, Emmanuel Daucé, Dieter Devlaminck, Rémi Munos
article: IEEE/EMBS 6th international conference on neural engineering (2013), IEEE/EMBS, Nov 2013, San Diego, United States
Accès au texte intégral et bibtex

titre: Quantification adaptative pour la stéganalyse d’images texturées
auteur: Emmanuel Zidel-Cauffet, Patrick Bas, Pierre Chainais
article: GRETSI 2013, Sep 2013, Brest, France
Accès au texte intégral et bibtex

titre: Regret Bounds for Reinforcement Learning with Policy Advice
auteur: Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill
article: ECML/PKDD – European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Sep 2013, Prague, Czech Republic
Accès au texte intégral et bibtex

titre: Finite-Time Analysis of Kernelised Contextual Bandits
auteur: Michal Valko, Nathan Korda, Rémi Munos, Ilias Flaounas, Nello Cristianini
article: Uncertainty in Artificial Intelligence, Jul 2013, Bellevue, United States
Accès au texte intégral et bibtex

titre: Distributed dictionary learning over a sensor network
auteur: Pierre Chainais, Cédric Richard
article: CaP 2013, Jul 2013, Villeneuve d’Ascq, France. pp.1-4
Accès au texte intégral et bibtex

titre: Cost-sensitive Multiclass Classification Risk Bounds
auteur: Bernardo Avila Pires, Mohammad Ghavamzadeh, Csaba Szepesvari
article: International Conference on Machine Learning, Jun 2013, Atlanta, United States
Accès au texte intégral et bibtex

titre: Gossip-based distributed stochastic bandit algorithms
auteur: Balázs Szörényi, Róbert Busa-Fekete, Istvan Hegedüs, Róbert Ormandi, Márk Jelasity, Balázs Kégl
article: ICML 2013 – 30th International Conference on Machine Learning, Jun 2013, Atlanta, United States. pp.19-27
Accès au texte intégral et bibtex

titre: Stochastic Simultaneous Optimistic Optimization
auteur: Michal Valko, Alexandra Carpentier, Rémi Munos
article: International Conference on Machine Learning, Jun 2013, Atlanta, United States
Accès au texte intégral et bibtex

titre: A Generalized Kernel Approach to Structured Output Learning
auteur: Hachem Kadri, Mohammad Ghavamzadeh, Philippe Preux
article: International Conference on Machine Learning (ICML), Jun 2013, Atlanta, United States
Accès au texte intégral et bibtex

titre: Learning from a Single Labeled Face and a Stream of Unlabeled Data
auteur: Branislav Kveton, Michal Valko
article: 10th IEEE International Conference on Automatic Face and Gesture Recognition, Apr 2013, Shanghai, China
Accès au texte intégral et bibtex

titre: Optimistic planning for belief-augmented Markov decision processes
auteur: Raphael Fonteneau, Lucian Busoniu, Rémi Munos
article: IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2013, Apr 2013, Singapore, Singapore. pp.CDROM
Accès au texte intégral et bibtex

titre: Time-series information and learning
auteur: Daniil Ryabko
article: ISIT – International Symposium on Information Theory, 2013, Istanbul, Turkey. pp.1392-1395
Accès au bibtex

titre: Nonparametric multiple change point estimation in highly dependent time series
auteur: Azadeh Khaleghi, Daniil Ryabko
article: Proc. 24th International Conf. on Algorithmic Learning Theory (ALT’13), 2013, Singapore, Singapore. pp.382-396
Accès au bibtex

titre: Unsupervised model-free representation learning
auteur: Daniil Ryabko
article: Proc. 24th International Conf. on Algorithmic Learning Theory (ALT’13), 2013, Singapore, Singapore. pp.354-366
Accès au bibtex

titre: Toward optimal stratification for stratified monte-carlo integration
auteur: Alexandra Carpentier, Rémi Munos
article: International Conference on Machine Learning, 2013, United States
Accès au texte intégral et bibtex

titre: Optimal Regret Bounds for Selecting the State Representation in Reinforcement Learning
auteur: Odalric-Ambrym Maillard, Phuong Nguyen, Ronald Ortner, Daniil Ryabko
article: ICML – 30th International Conference on Machine Learning, 2013, Atlanta, United States. pp.543-551
Accès au texte intégral et bibtex

titre: Thompson sampling for one-dimensional exponential family bandits
auteur: Nathaniel Korda, Emilie Kaufmann, Rémi Munos
article: Advances in Neural Information Processing Systems, 2013, United States
Accès au texte intégral et bibtex

titre: Aggregating optimistic planning trees for solving markov decision processes
auteur: Gunnar Kedenburg, Raphael Fonteneau, Rémi Munos
article: Advances in Neural Information Processing Systems, 2013, United States. pp.2382-2390
Accès au texte intégral et bibtex

titre: Competing with an Infinite Set of Models in Reinforcement Learning
auteur: Phuong Nguyen, Odalric-Ambrym Maillard, Daniil Ryabko, Ronald Ortner
article: AISTATS, 2013, Arizona, United States. pp.463-471
Accès au bibtex

Book sections

titre: A review of optimistic planning in Markov decision processes
auteur: Lucian Busoniu, Rémi Munos, Robert Babuska
article: Frank Lewis and Derong Liu. Reinforcement Learning and Adaptive Dynamic Programming for Feedback Control, Wiley-IEEE Press, pp.494-516, 2013, IEEE Press Series on Computational Intelligence, 978-1-1181-0420-0
Accès au bibtex

Reports

titre: Actor-Critic Algorithms for Risk-Sensitive MDPs
auteur: Prashanth L.A., Mohammad Ghavamzadeh
article: [Technical Report] 2013
Accès au texte intégral et bibtex

Theses

titre: On Some Unsupervised Learning Problems for Highly Dependent Time Series
auteur: Azadeh Khaleghi
article: Statistics [math.ST]. Institut national de recherche en informatique et en automatique (INRIA), 2013. English.
Accès au texte intégral et bibtex

2012

Journal articles

titre: Sequential approaches for learning datum-wise sparse representations
auteur: Gabriel Dulac-Arnold, Ludovic Denoyer, Philippe Preux, Patrick Gallinari
article: Machine Learning, 2012, 89 (1-2), pp.87-122. ⟨10.1007/s10994-012-5306-7⟩
Accès au texte intégral et bibtex

titre: Finite-Sample Analysis of Least-Squares Policy Iteration
auteur: Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos
article: Journal of Machine Learning Research, 2012, 13, pp.3041-3074
Accès au texte intégral et bibtex

titre: Learning with stochastic inputs and adversarial outputs
auteur: Alessandro Lazaric, Rémi Munos
article: Journal of Computer and System Sciences, 2012, 78 (5), pp.1516-1537. ⟨10.1016/j.jcss.2011.12.027⟩
Accès au texte intégral et bibtex

titre: Dislocation detection in field environments: A belief functions contribution
auteur: Saiedeh Razavi, Emmanuel Duflos, Carl Haas, Philippe Vanheeghe
article: Expert Systems with Applications, 2012, 39 (10), pp.8505-8513. ⟨10.1016/j.eswa.2011.12.014⟩
Accès au bibtex

titre: Dirichlet Process Mixtures for Density Estimation in Dynamic Nonlinear Modeling: Application to GPS Positioning in Urban Canyons
auteur: Asma Rabaoui, Nicolas Viandier, Juliette Marais, Emmanuel Duflos, Philippe Vanheeghe
article: IEEE Transactions on Signal Processing, 2012, 60 (4), pp.1638 – 1655. ⟨10.1109/TSP.2011.2180901⟩
Accès au texte intégral et bibtex

titre: Managing advertising campaigns — an approximate planning approach
auteur: Sertan Girgin, Jérémie Mary, Philippe Preux, Olivier Nicol
article: Frontiers of Computer Science, 2012, 6 (2), pp.209-229. ⟨10.1007/s11704-012-2873-5⟩
Accès au texte intégral et bibtex

titre: Linear Regression with Random Projections
auteur: Odalric Maillard, Rémi Munos
article: Journal of Machine Learning Research, 2012, 13 (1), pp.2735-2772
Accès au texte intégral et bibtex

titre: Testing composite hypotheses about discrete ergodic processes
auteur: Daniil Ryabko
article: Test, 2012, 21 (2), pp.317-329. ⟨10.1007/s11749-011-0245-3⟩
Accès au bibtex

Conference papers

titre: Multiple Operator-valued Kernel Learning
auteur: Hachem Kadri, Alain Rakotomamonjy, Francis Bach, Philippe Preux
article: Neural Information Processing Systems (NIPS), Dec 2012, Lake Tahoe, United States
Accès au texte intégral et bibtex

titre: Risk-Aversion in Multi-armed Bandits
auteur: Amir Sani, Alessandro Lazaric, Rémi Munos
article: NIPS – Twenty-Sixth Annual Conference on Neural Information Processing Systems, Dec 2012, Lake Tahoe, United States
Accès au texte intégral et bibtex

titre: Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
auteur: Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric
article: NIPS – Twenty-Sixth Annual Conference on Neural Information Processing Systems, Dec 2012, Lake Tahoe, United States
Accès au texte intégral et bibtex

titre: Reducing statistical time-series problems to binary classification
auteur: Daniil Ryabko, Jérémie Mary
article: NIPS, Dec 2012, Lake Tahoe, United States. pp.2069–2077
Accès au texte intégral et bibtex

titre: Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
auteur: Emilie Kaufmann, Nathaniel Korda, Rémi Munos
article: ALT 2012 – International Conference on Algorithmic Learning Theory, Oct 2012, Lyon, France. pp.199-213, ⟨10.1007/978-3-642-34106-9_18⟩
Accès au bibtex

titre: Fast Reinforcement Learning with Large Action Sets Using Error-Correcting Output Codes for MDP Factorization
auteur: Gabriel Dulac-Arnold, Ludovic Denoyer, Philippe Preux, Patrick Gallinari
article: European Conference on Machine Learning, Sep 2012, Bristol, United Kingdom. pp.180-194, ⟨10.1007/978-3-642-33486-3_12⟩
Accès au texte intégral et bibtex

titre: Towards dictionary learning from images with non Gaussian noise
auteur: Pierre Chainais
article: IEEE Int. Workshop on Machine Learning for Signal Processing, Sep 2012, Santander, Spain
Accès au texte intégral et bibtex

titre: Asymptotic Statistical Analysis of Stationary Ergodic Time Series
auteur: Daniil Ryabko
article: WITMSE 2012, Aug 2012, Amsterdam, Netherlands
Accès au texte intégral et bibtex

titre: Conservative and Greedy Approaches to Classification-based Policy Iteration
auteur: Mohammad Ghavamzadeh, Alessandro Lazaric
article: AAAI – 26th Conference on Artificial Intelligence, Jul 2012, Toronto, Canada
Accès au texte intégral et bibtex

titre: Semi-Supervised Apprenticeship Learning
auteur: Michal Valko, Mohammad Ghavamzadeh, Alessandro Lazaric
article: The 10th European Workshop on Reinforcement Learning (EWRL 2012), Jun 2012, Edinburgh, United Kingdom. pp.131-141
Accès au texte intégral et bibtex

titre: A Dantzig Selector Approach to Temporal Difference Learning
auteur: Matthieu Geist, Bruno Scherrer, Alessandro Lazaric, Mohammad Ghavamzadeh
article: ICML-12, Jun 2012, Edinburgh, United Kingdom. pp.1399-1406
Accès au bibtex

titre: Approximate Modified Policy Iteration
auteur: Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, Matthieu Geist
article: 29th International Conference on Machine Learning – ICML 2012, Jun 2012, Edinburgh, United Kingdom
Accès au texte intégral et bibtex

titre: A Truthful Learning Mechanism for Contextual Multi-Slot Sponsored Search Auctions with Externalities
auteur: Nicola Gatti, Alessandro Lazaric, Francesco Trovò
article: EC – 13th ACM Conference on Electronic Commerce, Jun 2012, Valencia, Spain
Accès au texte intégral et bibtex

titre: Classification Localement Parcimonieuse par Méthodes Séquentielles
auteur: Gabriel Dulac-Arnold, Ludovic Denoyer, Philippe Preux, Patrick Gallinari
article: CAP 2012 – Conférence Francophone sur l’Apprentissage Automatique, May 2012, Nancy, France
Accès au bibtex

titre: Un sélecteur de Dantzig pour l’apprentissage par différences temporelles
auteur: Matthieu Geist, Bruno Scherrer, Alessandro Lazaric, Mohammad Ghavamzadeh
article: Journées Francophones sur la planification, la décision et l’apprentissage pour le contrôle des systèmes – JFPDA 2012, May 2012, Villers-lès-Nancy, France. 13 p
Accès au texte intégral et bibtex

titre: Approximations de l’Algorithme Itérations sur les Politiques Modifié
auteur: Bruno Scherrer, Victor Gabillon, Mohammad Ghavamzadeh, Matthieu Geist
article: Journées Francophones sur la planification, la décision et l’apprentissage pour le contrôle des systèmes – JFPDA 2012, May 2012, Villers-lès-Nancy, France. 1 p
Accès au bibtex

titre: Apprentissage par renforcement rapide pour des grands ensembles d’actions en utilisant des codes correcteurs d’erreur
auteur: Gabriel Dulac-Arnold, Ludovic Denoyer, Philippe Preux, Patrick Gallinari
article: Journées Francophones sur la planification, la décision et l’apprentissage pour le contrôle des systèmes – JFPDA 2012, May 2012, Villers-lès-Nancy, France. 12 p
Accès au texte intégral et bibtex

titre: DPM pour l’inférence dans les modèles dynamiques non linéaires avec des bruits de mesure alpha-stable
auteur: Nouha Jaoua, Emmanuel Duflos, Philippe Vanheeghe
article: 44ème Journées de Statistique, May 2012, Brussels, Belgium. pp.1-4
Accès au bibtex

titre: Optimistic Planning for Markov Decision Processes
auteur: Lucian Busoniu, Rémi Munos
article: 15th International Conference on Artificial Intelligence and Statistics, AISTATS-12, Apr 2012, La Palma, Canary Islands, Spain. pp.182-189
Accès au texte intégral et bibtex

titre: Regret Bounds for Restless Markov Bandits
auteur: Ronald Ortner, Daniil Ryabko, Peter Auer, Rémi Munos
article: ALT 2012, 2012, Lyon, France. pp.214–228
Accès au bibtex

titre: Online Clustering of Processes
auteur: Azadeh Khaleghi, Daniil Ryabko, Jérémie Mary, Philippe Preux
article: AISTATS 2012, 2012, La Palma, Spain. pp.601-609
Accès au bibtex

titre: Adaptive Stratified Sampling for Monte-Carlo integration of Differentiable functions
auteur: Alexandra Carpentier, Rémi Munos
article: Advances in Neural Information Processing Systems, 2012, Lake Tahoe, United States
Accès au bibtex

titre: On the Sample Complexity of Reinforcement Learning with a Generative Model
auteur: Mohammad Gheshlaghi Azar, Rémi Munos, Hilbert Kappen
article: International Conference on Machine Learning, 2012, United Kingdom
Accès au texte intégral et bibtex

titre: Online Regret Bounds for Undiscounted Continuous Reinforcement Learning
auteur: Ronald Ortner, Daniil Ryabko
article: NIPS 2012, 2012, Lake Tahoe, United States. pp.1772–1780
Accès au bibtex

titre: Incremental Decision Tree based on order statistics
auteur: Christophe Salperwyck, Vincent Lemaire
article: Workshop on Active and Incremental Learning (without proceedings), Vincent Lemaire, Pascal Cuxac and Jean-Charles Lamirel, 2012, Montpellier, France
Accès au texte intégral et bibtex

titre: Locating Changes in Highly Dependent Data with Unknown Number of Change Points
auteur: Azadeh Khaleghi, Daniil Ryabko
article: NIPS 2012, 2012, Lake Tahoe, United States. pp.3095–3103
Accès au bibtex

titre: Bandit Algorithms boost Brain Computer Interfaces for motor-task selection of a brain-controlled button
auteur: Joan Fruitet, Alexandra Carpentier, Rémi Munos, Maureen Clerc
article: Advances in Neural Information Processing Systems, 2012, Lake Tahoe, Nevada, United States. pp.458–466
Accès au texte intégral et bibtex

Book sections

titre: Transfer in Reinforcement Learning: a Framework and a Survey
auteur: Alessandro Lazaric
article: Marco Wiering, Martijn van Otterlo. Reinforcement Learning – State of the art, 12, Springer, pp.143-173, 2012, ⟨10.1007/978-3-642-27645-3_5⟩
Accès au texte intégral et bibtex

titre: Bayesian Reinforcement Learning
auteur: Nikos Vlassis, Mohammad Ghavamzadeh, Shie Mannor, Pascal Poupart
article: Marco Wiering and Martijn van Otterlo. Reinforcement Learning: State of the Art, Springer Verlag, 2012
Accès au texte intégral et bibtex

Reports

titre: Risk-Aversion in Multi-armed Bandits
auteur: Amir Sani, Alessandro Lazaric, Rémi Munos
article: [Research Report] 2012
Accès au texte intégral et bibtex

titre: Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
auteur: Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric
article: [Research Report] 2012
Accès au texte intégral et bibtex

titre: Approximate Modified Policy Iteration
auteur: Bruno Scherrer, Victor Gabillon, Mohammad Ghavamzadeh, Matthieu Geist
article: [Research Report] 2012
Accès au texte intégral et bibtex

titre: A Truthful Learning Mechanism for Contextual Multi-Slot Sponsored Search Auctions with Externalities
auteur: Alessandro Lazaric, Nicola Gatti, Francesco Trovò
article: [Research Report] 2012
Accès au bibtex

titre: Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit
auteur: Alexandra Carpentier, Rémi Munos
article: [Technical Report] 2012
Accès au texte intégral et bibtex

Theses

titre: On optimal Sampling in low and high dimension
auteur: Alexandra Carpentier
article: Statistics [math.ST]. Université des Sciences et Technologie de Lille – Lille I, 2012. English.
Accès au texte intégral et bibtex

titre: Planification Optimiste pour Systèmes Déterministes
auteur: Jean-François Hren
article: Machine Learning [cs.LG]. Université des Sciences et Technologie de Lille – Lille I, 2012. French
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: On the Sample Complexity of Reinforcement Learning with a Generative Model
auteur: Mohammad Gheshlaghi Azar, Rémi Munos, Bert Kappen
article: 2012
Accès au bibtex

titre: Minimax Number of Strata for Online Stratified Sampling given Noisy Samples
auteur: Alexandra Carpentier, Rémi Munos
article: 2012
Accès au texte intégral et bibtex

2011

Journal articles

titre: Constructing perfect steganographic systems
auteur: Boris Ryabko, Daniil Ryabko
article: Information and Computation, 2011, 209 (9), pp.1223-1230. ⟨10.1016/j.ic.2011.06.004⟩
Accès au bibtex

titre: X-Armed Bandits
auteur: Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvari
article: Journal of Machine Learning Research, 2011, 12, pp.1655-1695
Accès au texte intégral et bibtex

titre: Pure exploration in finitely-armed and continuous-armed bandits
auteur: Gilles Stoltz, Sébastien Bubeck, Rémi Munos
article: Theoretical Computer Science, 2011, 412 (19), pp.1832-1852. ⟨10.1016/j.tcs.2010.12.059⟩
Accès au bibtex

titre: On the relation between realizable and non-realizable cases of the sequence prediction problem
auteur: Daniil Ryabko
article: Journal of Machine Learning Research, 2011, 12, pp.2161-2180
Accès au bibtex

titre: Identification of microbial and proteomic biomarkers in early childhood caries
auteur: Thomas C. Hart, Patricia M. Corby, Milos Hauskrecht, Ok Hee Ryu, Richard Pelikan, Michal Valko, Maria B. Oliveira, Gerald T. Hoehn, Walter A. Bretz
article: International Journal of Dentistry, 2011, 2011, pp.196721. ⟨10.1155/2011/196721⟩
Accès au texte intégral et bibtex

titre: Aligned carbon nanotube based ultrasonic microtransducers for durability monitoring in civil engineering
auteur: Bérengère Lebental, Pierre Chainais, Pascale Chenevier, Nicolas Chevalier, Eric Delevoye, Jean-Marc Fabbri, Sergio Nicoletti, Philippe Renaux, Anne Ghis
article: Nanotechnology, 2011, 22 (39), pp.395501. ⟨10.1088/0957-4484/22/39/395501⟩
Accès au texte intégral et bibtex

Conference papers

titre: Incremental Spectral Clustering with the Normalised Laplacian
auteur: Charanpal Dhanjal, Romaric Gaudel, Stéphan Clémençon
article: DISCML – 3rd NIPS Workshop on Discrete Optimization in Machine Learning – 2011, Dec 2011, Sierra Nevada, Spain
Accès au texte intégral et bibtex

titre: Finite-Time Analysis of Stratified Sampling for Monte Carlo
auteur: Alexandra Carpentier, Rémi Munos
article: NIPS – Twenty-Fifth Annual Conference on Neural Information Processing Systems, Dec 2011, Granada, Spain
Accès au texte intégral et bibtex

titre: Conditional Anomaly Detection with Soft Harmonic Functions
auteur: Michal Valko, Branislav Kveton, Hamed Valizadegan, Gregory Cooper, Milos Hauskrecht
article: Proceedings of the 2011 IEEE International Conference on Data Mining, Dec 2011, Vancouver, Canada
Accès au texte intégral et bibtex

titre: Selecting the State-Representation in Reinforcement Learning
auteur: Odalric-Ambrym Maillard, Rémi Munos, Daniil Ryabko
article: Neural Information Processing Systems, Dec 2011, Granada, Spain
Accès au bibtex

titre: Transfer from Multiple MDPs
auteur: Alessandro Lazaric, Marcello Restelli
article: NIPS – Twenty-Fifth Annual Conference on Neural Information Processing Systems, Dec 2011, Granada, Spain
Accès au texte intégral et bibtex

titre: CLOP: Confident Local Optimization for Noisy Black-Box Parameter Tuning
auteur: Rémi Coulom
article: Advances in Computer Games – 13th International Conference, Nov 2011, Tilburg, Netherlands. pp.146-157, ⟨10.1007/978-3-642-31866-5_13⟩
Accès au bibtex

titre: Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits
auteur: Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, Peter Auer
article: ALT – the 22nd conference on Algorithmic Learning Theory, Oct 2011, Espoo, Finland
Accès au texte intégral et bibtex

titre: Caractérisation statistique d’une assemblée de nanotubes en imagerie microscopique
auteur: Pierre Chainais, Bérengère Lebental
article: GRETSI, Sep 2011, France. 4p
Accès au texte intégral et bibtex

titre: Datum-wise classification. A sequential Approach to sparsity
auteur: Gabriel Dulac-Arnold, Ludovic Denoyer, Philippe Preux, Patrick Gallinari
article: ECML PKDD 2011 – European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Sep 2011, Athens, Greece. pp.375-390, ⟨10.1007/978-3-642-23780-5_34⟩
Accès au texte intégral et bibtex

titre: A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences
auteur: Odalric-Ambrym Maillard, Rémi Munos, Gilles Stoltz
article: 24th Annual Conference on Learning Theory : COLT’11, Jul 2011, Budapest, Hungary. pp.18
Accès au texte intégral et bibtex

titre: Stumping along a Summary for Exploration & Exploitation Challenge 2011
auteur: Christophe Salperwyck, Tanguy Urvoy
article: Workshop on On-line Trading of Exploration and Exploitation 2, Jul 2011, Bellevue, Washington, United States. pp.86-97
Accès au texte intégral et bibtex

titre: ICML Exploration & Exploitation challenge: Keep it simple!
auteur: Olivier Nicol, Jérémie Mary, Philippe Preux
article: Proceedings of the Workshop on On-line Trading of Exploration and Exploitation 2, Jul 2011, Bellevue, Washington, United States. pp.62-85
Accès au texte intégral et bibtex

titre: Confidence Sets in Time-Series Filtering
auteur: Boris Ryabko, Daniil Ryabko
article: IEEE International Symposium on Information Theory, Jul 2011, St. Petersburg, Russia. pp.2436-2438
Accès au bibtex

titre: Multi-Sensor PHD by Space Partitioning: Computation of a True Reference Density Within The PHD Framework
auteur: Emmanuel Delande, Emmanuel Duflos, Philippe Vanheeghe, Dominique Heurguier
article: Statistical Signal Processing Workshop (SSP), 2011, IEEE – Signal Processing Society, Jun 2011, Nice, France. pp.333 – 336, ⟨10.1109/SSP.2011.5967695⟩
Accès au texte intégral et bibtex

titre: Conditional Anomaly Detection Using Soft Harmonic Functions: An Application to Clinical Alerting
auteur: Michal Valko, Hamed Valizadegan, Branislav Kveton, Gregory Cooper, Milos Hauskrecht
article: The 28th International Conference on Machine Learning Workshop on Machine Learning for Global Challenges, Jun 2011, Seattle, United States
Accès au texte intégral et bibtex

titre: Classification-based Policy Iteration with a Critic
auteur: Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, Bruno Scherrer
article: International Conference on Machine Learning (ICML), Jun 2011, Seattle, United States. pp.1049-1056
Accès au texte intégral et bibtex

titre: Functional Regularized Least Squares Classification with Operator-valued Kernels
auteur: Hachem Kadri, Asma Rabaoui, Philippe Preux, Emmanuel Duflos, Alain Rakotomamonjy
article: 28th International Conference on Machine Learning (ICML), Jun 2011, Seattle, United States. pp.993–1000
Accès au texte intégral et bibtex

titre: Multiple functional regression with both discrete and continuous covariates
auteur: Hachem Kadri, Philippe Preux, Emmanuel Duflos, Stéphane Canu
article: 2nd International Workshop on Functional and Operatorial Statistics (IWFOS), Jun 2011, Santander, Spain. pp.189-195
Accès au texte intégral et bibtex

titre: Multi-Sensor PHD: Construction and Implementation by Space Partitioning
auteur: Emmanuel Delande, Emmanuel Duflos, Philippe Vanheeghe, Dominique Heurguier
article: International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011, May 2011, Prague, Czech Republic. pp.3632 – 3635, ⟨10.1109/ICASSP.2011.5947137⟩
Accès au texte intégral et bibtex

titre: On selecting the hyperparameters of the DPM models for the density estimation of observation errors
auteur: Asma Rabaoui, Emmanuel Duflos, Juliette Marais, Nicolas Viandier
article: International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2011, Prague, Czech Republic. pp.4092-4095, ⟨10.1109/ICASSP.2011.5947252⟩
Accès au bibtex

titre: Impulsive Interference Mitigation in Ad Hoc Networks Based on Alpha-Stable Modeling and Particle Filtering
auteur: Nouha Jaoua, Emmanuel Duflos, Philippe Vanheeghe, Laurent Clavier, François Septier
article: International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011, May 2011, Prague, Czech Republic. pp.3548 – 3551, ⟨10.1109/ICASSP.2011.5946244⟩
Accès au texte intégral et bibtex

titre: Learning vocal tract variables with multi-task kernels
auteur: Hachem Kadri, Emmanuel Duflos, Philippe Preux
article: 36th International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2011, Prague, Czech Republic
Accès au texte intégral et bibtex

titre: Handling Expensive Optimization with Large Noise
auteur: Rémi Coulom, Philippe Rolet, Nataliya Sokolovska, Olivier Teytaud
article: Foundations of Genetic Algorithms, Jan 2011, Austria. pp.TBA
Accès au texte intégral et bibtex

titre: Speedy Q-learning
auteur: Mohammad Gheshlaghi Azar, Rémi Munos, Mohammad Ghavamzadeh, Hilbert Kappen
article: Advances in Neural Information Processing Systems, 2011, Spain
Accès au texte intégral et bibtex

titre: Optimistic planning for sparsely stochastic systems
auteur: Lucian Busoniu, Rémi Munos, Bart de Schutter, Robert Babuska
article: IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2011, Paris, France. pp.48-55
Accès au texte intégral et bibtex

titre: Finite-sample analysis of Lasso-TD
auteur: Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos, Matt Hoffman
article: International Conference on Machine Learning, 2011, United States
Accès au texte intégral et bibtex

titre: Optimistic optimization of deterministic functions without the knowledge of its smoothness
auteur: Rémi Munos
article: Advances in Neural Information Processing Systems, 2011, Spain
Accès au texte intégral et bibtex

titre: Sparse Recovery with Brownian Sensing
auteur: Alexandra Carpentier, Odalric Maillard, Rémi Munos
article: Advances in Neural Information Processing Systems, 2011, Granada, Spain
Accès au bibtex

Book sections

titre: Bandit view on noisy optimization
auteur: Jean-Yves Audibert, Sébastien Bubeck, Rémi Munos
article: Optimization for Machine Learning, MIT Press, pp.431-454, 2011, 978-0-262-01646-9
Accès au bibtex

titre: Least-squares methods for policy iteration
auteur: Lucian Busoniu, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, Robert Babuska, Bart de Schutter
article: Reinforcement Learning: State of the Art, Springer, pp.75-109, 2011
Accès au texte intégral et bibtex

Habilitation à diriger des recherches

titre: LEARNABILITY IN PROBLEMS OF SEQUENTIAL INFERENCE
auteur: Daniil Ryabko
article: Machine Learning [cs.LG]. Université des Sciences et Technologie de Lille – Lille I, 2011
Accès au texte intégral et bibtex

Reports

titre: Multi-Bandit Best Arm Identification
auteur: Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric, Sébastien Bubeck
article: 2011
Accès au texte intégral et bibtex

titre: Reinforcement Learning with a Near Optimal Rate of Convergence
auteur: Mohammad Gheshlaghi Azar, Rémi Munos, Mohammad Ghavamzadeh, Hilbert Kappen
article: [Technical Report] 2011
Accès au texte intégral et bibtex

titre: Automatic motor task selection via a bandit algorithm for a brain-controlled button
auteur: Joan Fruitet, Alexandra Carpentier, Rémi Munos, Maureen Clerc
article: [Research Report] RR-7721, INRIA. 2011
Accès au texte intégral et bibtex

titre: Transfer from Multiple MDPs
auteur: Alessandro Lazaric, Marcello Restelli
article: [Technical Report] 2011
Accès au texte intégral et bibtex

titre: Classification-based Policy Iteration with a Critic
auteur: Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, Bruno Scherrer
article: 2011
Accès au texte intégral et bibtex

titre: Operator-Valued Kernels for Nonparametric Operator Estimation
auteur: Hachem Kadri, Philippe Preux, Emmanuel Duflos, Stéphane Canu
article: [Research Report] RR-7607, INRIA. 2011
Accès au texte intégral et bibtex

titre: Adaptive Bandits: Towards the best history-dependent strategy
auteur: Odalric-Ambrym Maillard, Rémi Munos
article: [Technical Report] 2011, pp.14
Accès au texte intégral et bibtex

Theses

titre: APPRENTISSAGE SÉQUENTIEL : Bandits, Statistique et Renforcement.
auteur: Odalric-Ambrym Maillard
article: Machine Learning [cs.LG]. Université des Sciences et Technologie de Lille – Lille I, 2011. English
Accès au texte intégral et bibtex

titre: Active Set Algorithms for the LASSO
auteur: Manuel Loth
article: Machine Learning [cs.LG]. Université des Sciences et Technologie de Lille – Lille I, 2011. English
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: Nearest Neighbor Clustering: A Baseline Method for Consistent Clustering with Arbitrary Objective Functions
auteur: Sébastien Bubeck, Ulrike von Luxburg
article: 2011
Accès au texte intégral et bibtex

2010

Journal articles

titre: Regret Bounds and Minimax Policies under Partial Monitoring
auteur: Jean-Yves Audibert, Sébastien Bubeck
article: Journal of Machine Learning Research, 2010, 11, pp.2785-2836
Accès au texte intégral et bibtex

titre: Discrimination between B-processes is impossible
auteur: Daniil Ryabko
article: Journal of Theoretical Probability, 2010, 23 (2), pp.565-575. ⟨10.1007/s10959-009-0263-1⟩
Accès au bibtex

titre: Nonparametric Statistical Inference for Ergodic Processes
auteur: Daniil Ryabko, Boris Ryabko
article: IEEE Transactions on Information Theory, 2010, 56 (3), pp.1430-1435. ⟨10.1109/TIT.2009.2039169⟩
Accès au texte intégral et bibtex

titre: On Finding Predictors for Arbitrary Families of Processes
auteur: Daniil Ryabko
article: Journal of Machine Learning Research, 2010, 11, pp.581-602
Accès au texte intégral et bibtex

Conference papers

titre: Planning-based Approach for Optimizing the Display of Online Advertising Campaigns
auteur: Sertan Girgin, Jérémie Mary, Philippe Preux, Olivier Nicol
article: NIPS workshop on Machine Learning in Online ADvertising, Dec 2010, Whistler, Canada
Accès au texte intégral et bibtex

titre: Advertising Campaigns Management: Should We Be Greedy?
auteur: Sertan Girgin, Jérémie Mary, Philippe Preux, Olivier Nicol
article: IEEE International Conference on Data Mining, Dec 2010, Sydney, Australia. pp.821-826
Accès au texte intégral et bibtex

titre: The Iso-regularization Descent Algorithm for the LASSO
auteur: Manuel Loth, Philippe Preux
article: 17th International Conference on Neural Information Processing, Nov 2010, Sydney, Australia
Accès au texte intégral et bibtex

titre: Advanced signal processing techniques for multipath mitigation in land transportation environment
auteur: Juliette Marais, Emmanuel Duflos, Nicolas Viandier, Donnay Fleury Nahimana, Asma Rabaoui
article: International IEEE Conference on Intelligent Transportation Systems (ITSC), Sep 2010, Funchal, Portugal. pp.1480-1485, ⟨10.1109/ITSC.2010.5625065⟩
Accès au bibtex

titre: Feature importance analysis for patient management decisions
auteur: Michal Valko, Milos Hauskrecht
article: 13th International Congress on Medical Informatics MEDINFO 2010, Sep 2010, Cape Town, South Africa. pp.861-865, ⟨10.3233/978-1-60750-588-4-861⟩
Accès au texte intégral et bibtex

titre: GNSS pseudorange error density tracking using Dirichlet Process Mixture
auteur: Nicolas Viandier, Asma Rabaoui, Juliette Marais, Emmanuel Duflos
article: FUSION 2010, Jul 2010, Edinburgh, United Kingdom. pp.1-7
Accès au bibtex

titre: Open Loop Optimistic Planning
auteur: Sébastien Bubeck, Rémi Munos
article: COLT 2010 – The 23rd Conference on Learning Theory, Jun 2010, Haifa, Israel
Accès au texte intégral et bibtex

titre: Best Arm Identification in Multi-Armed Bandits
auteur: Jean-Yves Audibert, Sébastien Bubeck
article: COLT – 23rd Conference on Learning Theory – 2010, Jun 2010, Haifa, Israel. 13 p
Accès au texte intégral et bibtex

titre: Simulation-based search of combinatorial games
auteur: Lukasz Lew, Rémi Coulom
article: ICML 2010 : Workshop on Machine Learning and Games, Jun 2010, Haifa, Israel
Accès au bibtex

titre: Analysis of a Classification-based Policy Iteration Algorithm
auteur: Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos
article: ICML – 27th International Conference on Machine Learning, Jun 2010, Haifa, Israel. pp.607-614
Accès au texte intégral et bibtex

titre: Bayesian Multi-Task Reinforcement Learning
auteur: Alessandro Lazaric, Mohammad Ghavamzadeh
article: ICML – 27th International Conference on Machine Learning, Jun 2010, Haifa, Israel. pp.599-606
Accès au texte intégral et bibtex

titre: Clustering processes
auteur: Daniil Ryabko
article: 27th International Conference on Machine Learning, Jun 2010, Haifa, Israel. pp.919-926
Accès au texte intégral et bibtex

titre: Finite-Sample Analysis of LSTD
auteur: Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos
article: ICML – 27th International Conference on Machine Learning, Jun 2010, Haifa, Israel. pp.615-622
Accès au texte intégral et bibtex

titre: Online Semi-Supervised Perception: Real-Time Learning without Explicit Feedback
auteur: Branislav Kveton, Michal Valko, Matthai Philipose, Ling Huang
article: 4th IEEE Online Learning for Computer Vision Workshop, Jun 2010, San Francisco, United States. ⟨10.1109/CVPRW.2010.5543877⟩
Accès au texte intégral et bibtex

titre: Online Semi-Supervised Learning on Quantized Graphs
auteur: Michal Valko, Branislav Kveton, Ling Huang, Daniel Ting
article: Uncertainty in Artificial Intelligence, Jun 2010, Catalina Island, United States
Accès au texte intégral et bibtex

titre: Semi-Supervised Learning with Max-Margin Graph Cuts
auteur: Branislav Kveton, Michal Valko, Ali Rahimi, Ling Huang
article: International Conference on Artificial Intelligence and Statistics, May 2010, Chia Laguna, Sardinia, Italy
Accès au texte intégral et bibtex

titre: Studies on DPM for the density estimation of pseudorange noises and evaluations on real data
auteur: Juliette Marais, Asma Rabaoui, Emmanuel Duflos
article: Position Location and Navigation Symposium (PLANS), 2010 IEEE/ION, May 2010, Indian Wells, CA, United States. pp.1154-1161, ⟨10.1109/PLANS.2010.5507234⟩
Accès au bibtex

titre: Belief Function Based Algorithm for Material Detection and Tracking in Construction
auteur: Emmanuel Duflos, Philippe Vanheeghe, Saiedeh Razavi, Carl Haas
article: BELIEF 2010 : Workshop on the Theory of Belief Functions, Apr 2010, Brest, France. CDROM – 6 p
Accès au bibtex

titre: Affichage de publicités sur des portails web
auteur: Victor Gabillon, Jérémie Mary, Philippe Preux
article: Extraction, Gestion des Connaissances (EGC), Jan 2010, Tunisia. pp.110-120
Accès au texte intégral et bibtex

titre: Scrambled Objects for Least-Squares Regression
auteur: Odalric Maillard, Rémi Munos
article: Advances in Neural Information Processing Systems, 2010, Granada, Spain
Accès au bibtex

titre: Online Learning in Adversarial Lipschitz Environments
auteur: Odalric Maillard, Rémi Munos
article: European Conference on Machine Learning, 2010, Barcelona, Spain
Accès au texte intégral et bibtex

titre: Error propagation for approximate policy and value iteration
auteur: Amir Massoud Farahmand, Rémi Munos, Csaba Szepesvari
article: Advances in Neural Information Processing Systems, 2010, Canada
Accès au texte intégral et bibtex

titre: Finite-Sample Analysis of Bellman Residual Minimization
auteur: Odalric-Ambrym Maillard, Rémi Munos, Alessandro Lazaric, Mohammad Ghavamzadeh
article: Asian Conference on Machine Learning, 2010, Japan
Accès au texte intégral et bibtex

titre: Nonlinear functional regression: a functional RKHS approach
auteur: Hachem Kadri, Emmanuel Duflos, Philippe Preux, Stéphane Canu, Manuel Davy
article: Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS’10), 2010, Italy. pp.374-380
Accès au texte intégral et bibtex

titre: Sequence prediction in realizable and non-realizable cases
auteur: Daniil Ryabko
article: Conference on Learning Theory, 2010, Haifa, Israel. pp.119-131
Accès au texte intégral et bibtex

titre: Testing composite hypotheses about discrete-valued stationary processes
auteur: Daniil Ryabko
article: IEEE Information Theory Workshop, 2010, Cairo, Egypt. pp.291-295
Accès au bibtex

titre: LSTD with Random Projections
auteur: Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric Maillard, Rémi Munos
article: Advances in Neural Information Processing Systems, 2010, Granada, Spain
Accès au bibtex

Book sections

titre: Approximate Dynamic Programming
auteur: Rémi Munos
article: Olivier Sigaud and Olivier Buffet. Markov Decision Processes in Artificial Intelligence, ISTE Ltd and John Wiley & Sons Inc, pp.67–98, 2010
Accès au bibtex

titre: Robust Unsupervised Speaker Segmentation for Audio Diarization
auteur: Hachem Kadri, Manuel Davy, Noureddine Ellouze
article: Signal Processing, INTECH, pp.307-320, 2010
Accès au texte intégral et bibtex

titre: A comparison of two machine learning approaches for Photometric Solids Compression
auteur: Samuel Delepoulle, François Rouselle, Christophe Renaud, Philippe Preux
article: Dimitri Plemenos, Georgios Miaoulis. Intelligent Computer Graphics, 321, Springer, pp.145-164, 2010, Studies in Computational Intelligence
Accès au bibtex

Documents associated with scientific events

titre: Finite sample analysis of Least Squares Temporal Differences
auteur: Rémi Munos
article: Journées MAS et Journée en l’honneur de Jacques Neveu, Aug 2010, Talence, France
Accès au texte intégral et bibtex

Reports

titre: Linear regression with random projections
auteur: Odalric-Ambrym Maillard, Rémi Munos
article: [Technical Report] 2010, pp.22
Accès au texte intégral et bibtex

titre: LSPI with Random Projections
auteur: Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric Maillard, Rémi Munos
article: [Technical Report] 2010
Accès au texte intégral et bibtex

titre: Finite-Sample Analysis of Least-Squares Policy Iteration
auteur: Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos
article: [Technical Report] 2010
Accès au texte intégral et bibtex

titre: Advertising Campaigns Management: Should We Be Greedy?
auteur: Sertan Girgin, Jérémie Mary, Philippe Preux, Olivier Nicol
article: [Research Report] RR-7388, INRIA. 2010, pp.27
Accès au texte intégral et bibtex

titre: Multi-target PHD filtering: proposition of extensions to the multi-sensor case
auteur: Emmanuel Delande, Emmanuel Duflos, Dominique Heurguier, Philippe Vanheeghe
article: [Research Report] RR-7337, INRIA. 2010, pp.64
Accès au texte intégral et bibtex

titre: Brownian Motions and Scrambled Wavelets for Least-Squares Regression
auteur: Odalric-Ambrym Maillard, Rémi Munos
article: [Technical Report] 2010, pp.13
Accès au texte intégral et bibtex

Theses

titre: Bandits Games and Clustering Foundations
auteur: Sébastien Bubeck
article: Statistics [math.ST]. Université des Sciences et Technologie de Lille – Lille I, 2010. English
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: Pure Exploration for Multi-Armed Bandit Problems
auteur: Sébastien Bubeck, Rémi Munos, Gilles Stoltz
article: 2010
Accès au texte intégral et bibtex

2009

Journal articles

titre: Natural Actor-Critic Algorithms
auteur: Shalabh Bhatnagar, Richard Sutton, Mohammad Ghavamzadeh, Mark Lee
article: Automatica, 2009, 45 (11), ⟨10.1016/j.automatica.2009.07.008⟩
Accès au texte intégral et bibtex

titre: Radar Optimal Times Detection Allocation in Multitarget Environment
auteur: Marie de Vilmorin, Emmanuel Duflos, Philippe Vanheeghe
article: IEEE Systems Journal, 2009, 3 (2), pp.210-220. ⟨10.1109/JSYST.2009.2017393⟩
Accès au texte intégral et bibtex

titre: Le jeu de go et la révolution de Monte Carlo
auteur: Rémi Coulom
article: Interstices, 2009
Accès au bibtex

titre: Asymptotically Optimal Perfect Steganographic Systems
auteur: Boris Ryabko, Daniil Ryabko
article: Problems of Information Transmission, 2009, 45 (2), pp.184-190. ⟨10.1134/S0032946009020094⟩
Accès au bibtex

titre: Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
auteur: Jean-Yves Audibert, Rémi Munos, Csaba Szepesvari
article: Theoretical Computer Science, 2009, 410 (19), pp.1876–1902. ⟨10.1016/j.tcs.2009.01.016⟩
Accès au bibtex

titre: Using Data Compressors to Construct Rank Tests
auteur: Daniil Ryabko, Juergen Schmidhuber
article: Applied Mathematics Letters, 2009, 22 (7), pp.1029-1032
Accès au texte intégral et bibtex

titre: Multifractal Random Walks as Fractional Wiener Integrals
auteur: Patrice Abry, Pierre Chainais, Laure Coutin, Vladas Pipiras
article: IEEE Transactions on Information Theory, 2009, 55 (8), pp.3825-3846. ⟨10.1109/TIT.2009.2023708⟩
Accès au texte intégral et bibtex: https://hal.science/hal-00808604/file/multifractal-fbm-rev14.pdf

Conference papers

titre: ECON: a Kernel Basis Pursuit Algorithm with Automatic Feature Parameter Tuning, and its Application to Photometric Solids Approximation
auteur: Manuel Loth, Philippe Preux, Samuel Delepoulle, Christophe Renaud
article: International Conference on Machine Learning and Applications, Dec 2009, Miami, United States
Accès au texte intégral et bibtex

titre: Compressed Least-Squares Regression
auteur: Odalric-Ambrym Maillard, Rémi Munos
article: NIPS 2009, Dec 2009, Vancouver, Canada
Accès au texte intégral et bibtex

titre: On the use of Dirichlet process mixtures for the modelling of pseudorange errors in multi-constellation based localisation
auteur: Asma Rabaoui, Nicolas Viandier, Juliette Marais, Emmanuel Duflos
article: International Conference on Intelligent Transport Systems Telecommunications, (ITST), Oct 2009, Lille, France. pp.465-470, ⟨10.1109/ITST.2009.5399308⟩
Accès au bibtex

titre: Enhancement of Galileo and multi-constellation accuracy by modeling pseudorange noises
auteur: Nicolas Viandier, Asma Rabaoui, Juliette Marais, Emmanuel Duflos
article: Intelligent Transport Systems Telecommunications, (ITST), Oct 2009, Lille, France. pp.459-464, ⟨10.1109/ITST.2009.5399311⟩
Accès au bibtex

titre: Real world implementation of belief function theory to detect dislocation of materials in construction
auteur: Saiedeh Razavi, Carl Haas, Philippe Vanheeghe, Emmanuel Duflos
article: FUSION 2009, Jul 2009, Seattle, WA, United States. pp.748-755
Accès au bibtex

titre: Hybrid Stochastic-Adversarial On-line Learning
auteur: Alessandro Lazaric, Rémi Munos
article: COLT 2009 – 22nd Conference on Learning Theory, Jun 2009, Montreal, Canada
Accès au texte intégral et bibtex

titre: Minimax policies for adversarial and stochastic bandits
auteur: Jean-Yves Audibert, Sébastien Bubeck
article: COLT, Jun 2009, Montreal, Canada. pp.217-226
Accès au texte intégral et bibtex

titre: Feature Discovery in Approximate Dynamic Programming
auteur: Philippe Preux, Sertan Girgin, Manuel Loth
article: Approximate Dynamic Programming and Reinforcement Learning, Mar 2009, Nashville, United States
Accès au bibtex

titre: Characterizing predictable classes of processes
auteur: Daniil Ryabko
article: UAI, 2009, Montreal, Canada. pp.471-478
Accès au texte intégral et bibtex

titre: Sensitivity analysis in HMMs with application to likelihood maximization
auteur: Pierre-Arnaud Coquelin, Romain Deguest, Rémi Munos
article: Advances in Neural Information Processing Systems, 2009, Canada
Accès au texte intégral et bibtex

titre: Using Kolmogorov Complexity for Understanding Some Limitations on Steganography
auteur: Boris Ryabko, Daniil Ryabko
article: IEEE International Symposium on Information Theory, 2009, Seoul, South Korea. pp.2733-2736
Accès au bibtex

titre: An impossibility result for process discrimination
auteur: Daniil Ryabko
article: International Symposium on Information Theory, IEEE, 2009, Seoul, South Korea. pp.1734-1738
Accès au texte intégral et bibtex

Book sections

titre: Light Source Storage and Interpolation for Global Illumination: a neural solution
auteur: Samuel Delepoulle, Christophe Renaud, Philippe Preux
article: Dimitri Plemenos, Georgios Miaoulis. Intelligent Computer Graphics, 240, Springer, pp.87-104, 2009, Studies in Computational Intelligence
Accès au bibtex

Books

titre: Recent Advances in Reinforcement Learning
auteur: Sertan Girgin, Manuel Loth, Rémi Munos, Philippe Preux, Daniil Ryabko
article: Springer, Lecture Notes in Artificial Intelligence (LNAI), vol. 5323, pp.281, 2009
Accès au bibtex

Reports

titre: A criterion for hypothesis testing for stationary processes
auteur: Daniil Ryabko
article: [Research Report] INRIA Lille. 2009
Accès au texte intégral et bibtex

titre: General Framework for Nonlinear Functional Regression with Reproducing Kernel Hilbert Spaces
auteur: Hachem Kadri, Emmanuel Duflos, Manuel Davy, Philippe Preux, Stéphane Canu
article: [Research Report] RR-6908, INRIA. 2009
Accès au texte intégral et bibtex

2008

Journal articles

titre: Using One-Class SVMs and Wavelets for Audio Surveillance
auteur: Asma Rabaoui, Manuel Davy, Stéphane Rossignol, Noureddine Ellouze
article: IEEE Transactions on Information Forensics and Security, 2008, 3 (4), pp.763-775. ⟨10.1109/TIFS.2008.2008216⟩
Accès au bibtex

titre: On the Possibility of Learning in Reactive Environments with Arbitrary Dependence
auteur: Daniil Ryabko, Marcus Hutter
article: Theoretical Computer Science, 2008, 405 (3), pp.274-284. ⟨10.1016/j.tcs.2008.06.039⟩
Accès au bibtex

titre: Least committed basic belief density induced by a multivariate Gaussian: Formulation with applications
auteur: François Caron, Branko Ristic, Emmanuel Duflos, Philippe Vanheeghe
article: International Journal of Approximate Reasoning, 2008, 48 (2), pp.419-436. ⟨10.1016/j.ijar.2006.10.003⟩
Accès au bibtex

titre: Predicting Non-Stationary Processes
auteur: Daniil Ryabko, Marcus Hutter
article: Applied Mathematics Letters, 2008, 21 (5), pp.477-482. ⟨10.1016/j.aml.2007.04.004⟩
Accès au bibtex

titre: Bayesian Inference for Linear Dynamic Models with Dirichlet Process Mixtures
auteur: François Caron, Manuel Davy, Arnaud Doucet, Emmanuel Duflos, Philippe Vanheeghe
article: IEEE Transactions on Signal Processing, 2008, 56 (1), pp.71-84. ⟨10.1109/TSP.2007.900167⟩
Accès au texte intégral et bibtex

titre: Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
auteur: Andras Antos, Csaba Szepesvari, Rémi Munos
article: Machine Learning, 2008, 71, pp.89-129. ⟨10.1007/s10994-007-5038-2⟩
Accès au texte intégral et bibtex

Conference papers

titre: Incremental Basis Function Expansion in Reinforcement Learning using Cascade-Correlation Networks
auteur: Sertan Girgin, Philippe Preux
article: 8th International Conference on Machine Learning and Applications, Dec 2008, San Diego, United States. pp.75-82
Accès au texte intégral et bibtex

titre: Online Optimization in X-Armed Bandits
auteur: Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvari
article: Twenty-Second Annual Conference on Neural Information Processing Systems, Dec 2008, Vancouver, Canada
Accès au texte intégral et bibtex

titre: Basis Function Construction in Reinforcement Learning using Cascade-Correlation Learning Architecture
auteur: Sertan Girgin, Philippe Preux
article: International Conference on Machine Learning and Applications, Dec 2008, San Diego, United States. pp.75-82
Accès au texte intégral et bibtex

titre: Some sufficient conditions on an arbitrary class of stochastic processes for the existence of a predictor.
auteur: Daniil Ryabko
article: 19th International Conference on Algorithmic Learning Theory, ALT 2008, Oct 2008, Budapest, Hungary. pp.169-182, ⟨10.1007/978-3-540-87987-9_17⟩
Accès au texte intégral et bibtex

titre: Whole-History Rating: A Bayesian Rating System for Players of Time-Varying Strength
auteur: Rémi Coulom
article: Computers and Games, Sep 2008, Beijing, China. pp.113–124
Accès au texte intégral et bibtex

titre: Conditional anomaly detection methods for patient-management alert systems
auteur: Michal Valko, Gregory F. Cooper, Amy Seybert, Shyam Visweswaran, Melissa Saul, Milos Hauskrecht
article: Workshop on Machine Learning in Health Care Applications in The 25th International Conference on Machine Learning, Jul 2008, Helsinki, Finland
Accès au texte intégral et bibtex

titre: Optimal Policies Search for Sensor Management
auteur: Thomas Bréhard, Emmanuel Duflos, Philippe Vanheeghe, Pierre-Arnaud Coquelin
article: FUSION 2008, Jun 2008, Cologne, Germany. pp.1 – 8
Accès au texte intégral et bibtex

titre: Optimal policies search for sensor management: Application to the AESA radar
auteur: Thomas Bréhard, Pierre-Arnaud Coquelin, Emmanuel Duflos, Philippe Vanheeghe
article: 11th International Conference on Information Fusion, Jun 2008, Cologne, Germany. pp.1 – 8
Accès au bibtex

titre: Reception State Estimation of GNSS satellites in urban environment using particle filtering
auteur: Donnay Fleury Nahimana, Emmanuel Duflos, Juliette Marais
article: FUSION 2008, Jun 2008, Cologne, Germany
Accès au texte intégral et bibtex

titre: Basis Expansion in Natural Actor Critic Methods
auteur: Sertan Girgin, Philippe Preux
article: European Workshop on Reinforcement Learning, Jun 2008, Villeneuve d’Ascq, France. pp.110-123
Accès au texte intégral et bibtex

titre: Distance Metric Learning for Conditional Anomaly Detection
auteur: Michal Valko, Milos Hauskrecht
article: Twenty-First International Florida Artificial Intelligence Research Society Conference, May 2008, Coconut Grove, Florida, United States
Accès au texte intégral et bibtex

titre: Speech recognition with speech density estimation by the Dirichlet process mixture
auteur: Kenko Ota, Emmanuel Duflos, Philippe Vanheeghe, Masuzo Yanagida
article: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008), Mar 2008, Las Vegas, United States. pp.1553 – 1556, ⟨10.1109/ICASSP.2008.4517919⟩
Accès au bibtex

titre: Learning predictive models for combinations of heterogeneous proteomic data sources
auteur: Michal Valko, Richard Pelikan, Milos Hauskrecht
article: AMIA Summit on Translational Bioinformatics, Mar 2008, San Francisco, United States
Accès au texte intégral et bibtex

titre: Optimistic planning of deterministic systems
auteur: Jean-François Hren, Rémi Munos
article: European Workshop on Reinforcement Learning, 2008, France. pp.151-164
Accès au texte intégral et bibtex

titre: Feature discovery in reinforcement learning using genetic programming
auteur: Sertan Girgin, Philippe Preux
article: 11th European Conference on Genetic Programming (EUROGP), 2008, Naples, Italy. pp.218-229
Accès au texte intégral et bibtex

titre: Infinitely many-armed bandits
auteur: Yizao Wang, Jean-Yves Audibert, Rémi Munos
article: Advances in Neural Information Processing Systems, 2008, Canada
Accès au texte intégral et bibtex

titre: Particle filter-based policy gradient for POMDPs
auteur: Pierre-Arnaud Coquelin, Romain Deguest, Rémi Munos
article: Advances in Neural Information Processing Systems, 2008, Canada
Accès au texte intégral et bibtex

titre: Adaptive play in Texas Hold’em poker
auteur: Raphael Maitrepierre, Jérémie Mary, Rémi Munos
article: European Conference on Artificial Intelligence, 2008, France
Accès au texte intégral et bibtex

Book sections

titre: Programmation dynamique avec approximation de la fonction valeur
auteur: Rémi Munos
article: Processus décisionnels de Markov et intelligence artificielle, Hermes, pp.19-50, 2008
Accès au texte intégral et bibtex

Reports

titre: Sensitivity Analysis in Particle Filters. Application to Policy Optimization in POMDPs
auteur: Pierre-Arnaud Coquelin, Romain Deguest, Rémi Munos
article: [Research Report] RR-6710, INRIA. 2008
Accès au texte intégral et bibtex

titre: Incremental Basis Function Expansion in Reinforcement Learning using Cascade-Correlation Networks
auteur: Sertan Girgin, Philippe Preux
article: [Research Report] RR-6505, INRIA. 2008
Accès au texte intégral et bibtex

titre: The Equi-Correlation Network: a New Kernelized-LARS with Automatic Kernel Parameters Tuning
auteur: Manuel Loth, Philippe Preux
article: [Research Report] RR-6794, INRIA. 2008
Accès au texte intégral et bibtex

2007

Journal articles

titre: Joint Segmentation of Piecewise Constant Autoregressive Processes by Using a Hierarchical Model and a Bayesian Sampling Approach
auteur: Nicolas Dobigeon, Jean-Yves Tourneret, Manuel Davy
article: IEEE Transactions on Signal Processing, 2007, 55 (4), pp.1251-1263. ⟨10.1109/TSP.2006.889090⟩
Accès au texte intégral et bibtex

titre: Performance Bounds in $L_p$ norm for Approximate Value Iteration
auteur: Rémi Munos
article: SIAM Journal on Control and Optimization, 2007, 46 (2), pp.541-561. ⟨10.1137/040614384⟩
Accès au texte intégral et bibtex

titre: L’Ordinateur, champion de go ?
auteur: Sylvain Gelly, Rémi Munos
article: Pour la science, 2007, 354, pp.28-35
Accès au bibtex

titre: Analyse en norme $L_p$ de l’algorithme d’itérations sur les valeurs avec approximations
auteur: Rémi Munos
article: Revue des Sciences et Technologies de l’Information – Série RIA : Revue d’Intelligence Artificielle, 2007, 21
Accès au texte intégral et bibtex

Conference papers

titre: Consistent Minimization of Clustering Objective Functions
auteur: Ulrike von Luxburg, Sébastien Bubeck, Stefanie Jegelka, Michael Kaufmann
article: Neural Information Processing Systems, Dec 2007, Vancouver, Canada
Accès au texte intégral et bibtex

titre: Monte-Carlo Tree Search in Crazy Stone
auteur: Rémi Coulom
article: 12th Game Programming Workshop, Nov 2007, Hakone, Japan
Accès au bibtex

titre: Computing Elo Ratings of Move Patterns in the Game of Go
auteur: Rémi Coulom
article: Computer Games Workshop, Jun 2007, Amsterdam, Netherlands
Accès au texte intégral et bibtex

titre: A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD
auteur: Manuel Loth, Philippe Preux, Manuel Davy
article: European Symposium on Artificial Neural Networks, Apr 2007, Bruges, Belgium
Accès au texte intégral et bibtex

titre: A Dynamic Programming Approach to Viability Problems
auteur: Pierre-Arnaud Coquelin, Sophie Martin, Rémi Munos
article: IEEE ADPRL, IEEE Computational Intelligence Society, Apr 2007, Hawaii, United States. pp.178-184
Accès au texte intégral et bibtex

titre: Sparse Temporal Difference Learning using LASSO
auteur: Manuel Loth, Manuel Davy, Philippe Preux
article: IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, Apr 2007, Hawaii, United States
Accès au texte intégral et bibtex

titre: Time Allocation of a Set of Radars in a Multitarget Environment
auteur: Emmanuel Duflos, Marie de Vilmorin, Philippe Vanheeghe
article: FUSION 2007, 2007, Québec, Canada
Accès au texte intégral et bibtex

titre: Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory
auteur: Andras Antos, Csaba Szepesvari, Rémi Munos
article: IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, 2007, Hawaii, United States. pp.2007
Accès au texte intégral et bibtex

titre: Fitted Q-iteration in continuous action-space MDPs
auteur: Andras Antos, Rémi Munos, Csaba Szepesvari
article: Neural Information Processing Systems, 2007, Vancouver, Canada
Accès au texte intégral et bibtex

titre: Tuning bandit algorithms in stochastic environments
auteur: Jean-Yves Audibert, Rémi Munos, Csaba Szepesvari
article: Algorithmic Learning Theory, 2007, Sendai, Japan. pp.150-165
Accès au texte intégral et bibtex

titre: Bandit Algorithms for Tree Search
auteur: Pierre-Arnaud Coquelin, Rémi Munos
article: Uncertainty in Artificial Intelligence, 2007, Vancouver, Canada
Accès au texte intégral et bibtex

Reports

titre: Finite Time Bounds for Sampling-Based Fitted Value Iteration
auteur: Rémi Munos, Csaba Szepesvari
article: [Research Report] 2007, pp.46
Accès au texte intégral et bibtex

titre: Feature Discovery in Reinforcement Learning using Genetic Programming
auteur: Sertan Girgin, Philippe Preux
article: [Research Report] INRIA. 2007
Accès au texte intégral et bibtex

titre: Optimal Policies Search for Sensor Management: Application to the AESA Radar
auteur: Thomas Bréhard, Pierre-Arnaud Coquelin, Emmanuel Duflos
article: [Research Report] RR-6361, INRIA. 2007, pp.21
Accès au texte intégral et bibtex

titre: Fitted Q-iteration in continuous action-space MDPs
auteur: Andras Antos, Rémi Munos, Csaba Szepesvari
article: [Technical Report] 2007, pp.24
Accès au texte intégral et bibtex

titre: Numerical methods for sensitivity analysis of Feynman-Kac models
auteur: Pierre-Arnaud Coquelin, Romain Deguest, Rémi Munos
article: [Research Report] 2007
Accès au texte intégral et bibtex

titre: Bandit Algorithms for Tree Search
auteur: Pierre-Arnaud Coquelin, Rémi Munos
article: [Research Report] RR-6141, INRIA. 2007, pp.20
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: A Dynamic Programming Approach to Viability Problems
auteur: Pierre-Arnaud Coquelin, Sophie Martin, Rémi Munos
article: 2007
Accès au texte intégral et bibtex

titre: Numerical methods for sensitivity analysis of Feynman-Kac models
auteur: Pierre-Arnaud Coquelin, Romain Deguest, Rémi Munos
article: 2007
Accès au texte intégral et bibtex

2006

Journal articles

titre: Numerical methods for the pricing of Swing options: a stochastic control approach
auteur: Christophe Barrera-Esteve, Florent Bergeret, Charles H Dossal, Emmanuel Gobet, Asma Meziou, Rémi Munos, Damien Reboul-Salze
article: Methodology and Computing in Applied Probability, 2006, 8 (4), pp.517-540. ⟨10.1007/s11009-006-0427-8⟩
Accès au texte intégral et bibtex

titre: An Online Support Vector Machine for Abnormal Events Detection
auteur: Manuel Davy, Frederic Desobry, Arthur Gretton, Christian Doncarli
article: Signal Processing, 2006, 86 (8), pp.2009-2025. ⟨10.1016/j.sigpro.2005.09.027⟩
Accès au bibtex

titre: An anti-diffusive scheme for viability problems
auteur: Olivier Bokanowski, Sophie Martin, Rémi Munos, Hasnaa Zidani
article: Applied Numerical Mathematics: an IMACS journal, 2006, 56 (9), pp.1147-1162
Accès au texte intégral et bibtex

titre: Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation
auteur: Rémi Munos
article: Journal of Machine Learning Research, 2006, 7, pp.413-427
Accès au texte intégral et bibtex

titre: Policy Gradient in Continuous Time
auteur: Rémi Munos
article: Journal of Machine Learning Research, 2006, 7, pp.771-791
Accès au texte intégral et bibtex

titre: Bayesian Analysis of Polyphonic Western Tonal Music
auteur: Manuel Davy, Simon Godsill, Jérôme Idier
article: Journal of the Acoustical Society of America, 2006, 119 (4), pp.2498-2517. ⟨10.1121/1.2168548⟩
Accès au bibtex

Conference papers

titre: A Comparison of Chief Complaints and Emergency Department Reports for Identifying Patients with Acute Lower Respiratory Syndrome
auteur: Wendy Chapman, John Dowling, Gregory F Cooper, Milos Hauskrecht, Michal Valko
article: International Society for Disease Surveillance, Oct 2006, Baltimore, United States
Accès au texte intégral et bibtex

titre: Equi-Gradient Temporal Difference Learning
auteur: Manuel Loth, Manuel Davy, Rémi Coulom, Philippe Preux
article: Kernel Methods and Reinforcement Learning, workshop of ICML 2006, Jun 2006, Pittsburgh, United States
Accès au texte intégral et bibtex

titre: Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
auteur: Andras Antos, Csaba Szepesvari, Rémi Munos
article: Conference on Learning Theory, Jun 2006, Pittsburgh, United States
Accès au texte intégral et bibtex

titre: Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search
auteur: Rémi Coulom
article: 5th International Conference on Computers and Games, May 2006, Turin, Italy
Accès au texte intégral et bibtex

titre: Joint segmentation of piecewise constant autoregressive processes by using a hierarchical model and a Bayesian sampling approach
auteur: Nicolas Dobigeon, Jean-Yves Tourneret, Manuel Davy
article: IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP 2006), May 2006, Toulouse, France. ⟨10.1109/ICASSP.2006.1660575⟩
Accès au texte intégral et bibtex

titre: Application des machines à vecteurs support mono-classe à l’indexation en locuteurs de documents audio
auteur: Belkacem Fergani, Manuel Davy, Amrane Houacine
article: Journées d’Étude sur la Parole 2006, 2006, Dinard, France
Accès au texte intégral et bibtex

titre: Estimation of Minimum Measure Sets in Reproducing Kernel Hilbert Spaces and Applications
auteur: Manuel Davy, Frederic Desobry, Stephane Canu
article: IEEE ICASSP 2006, 2006, Toulouse, France
Accès au texte intégral et bibtex

titre: Maximum Likelihood Parameter Estimation for Latent Variable Models Using Sequential Monte Carlo
auteur: Adam Johansen, Arnaud Doucet, Manuel Davy
article: 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2006), May 2006, Toulouse, France
Accès au texte intégral et bibtex

titre: Bayesian Inference for Dynamic Models with Dirichlet Process Mixtures
auteur: François Caron, Manuel Davy, Arnaud Doucet, Emmanuel Duflos, Philippe Vanheeghe
article: 9th IEEE International Conference on Information Fusion, 2006, Florence, Italy
Accès au texte intégral et bibtex

Book sections

titre: Feature Selection and Dimensionality Reduction in Genomics and Proteomics
auteur: Milos Hauskrecht, Richard Pelikan, Michal Valko, James Lyons-Weiler
article: Werner Dubitzky, Martin Granzow and Daniel Berrar. Fundamentals of Data Mining in Genomics and Proteomics, Springer, pp.149-172, 2006, ⟨10.1007/978-0-387-47509-7⟩
Accès au texte intégral et bibtex

Other publications

titre: Use of variance estimation in the multi-armed bandit problem
auteur: Jean-Yves Audibert, Rémi Munos, Csaba Szepesvari
article: 2006
Accès au texte intégral et bibtex

Books

titre: Signal Processing Methods for Music Transcription
auteur: Anssi Klapuri, Manuel Davy
article: Springer, pp.456, 2006, 0-387-30667-6
Accès au bibtex

Reports

titre: Modification of UCT with Patterns in Monte-Carlo Go
auteur: Sylvain Gelly, Yizao Wang, Rémi Munos, Olivier Teytaud
article: [Research Report] RR-6062, INRIA. 2006
Accès au texte intégral et bibtex

2005

Master thesis

titre: Evolving Neural Networks for Statistical Decision Theory
auteur: Michal Valko
article: Machine Learning [stat.ML]. 2005
Accès au texte intégral et bibtex