Contextual Bandit: from Theory to Applications. - Vernade - Workshop 3 - CEB T1 2019

Institut Henri Poincaré

Claire Vernade (Google Deepmind) / 05.04.2019
Contextual Bandit: from Theory to Applications.
Trading off exploration versus exploitation is a key problem in computer science: it is about learning how to make decisions in order to optimize a long-term cost. While many areas of machine learning aim at estimating a hidden function given a dataset, reinforcement learning is rather about optimally building a dataset of observations of this hidden function that contains just enough information to guarantee that the maximum is properly estimated. The first part of this talk reviews the main techniques and results known for the contextual linear bandit, relying mostly on the recent book by Lattimore and Szepesvári (2019) [1]. However, real-world problems often do not behave as the theory would like them to. In the second part of this talk, we share our experience in applying bandit algorithms in industry [2]. In particular, while the system is supposed to interact with its environment, customers' feedback is often delayed or missing, which prevents the necessary updates from being performed. We propose a solution to this issue, discuss some alternative models and architectures, and finish the presentation with open questions on sequential learning beyond bandits.
[1] Lattimore, Tor, and Csaba Szepesvári. Bandit Algorithms. Preprint (2018).
[2] Vernade, Claire, et al. Contextual Bandits under Delayed Feedback. arXiv preprint arXiv:1807.02089 (2018).
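
As a rough illustration of the contextual linear bandit mentioned in the abstract, here is a minimal LinUCB-style sketch in plain Python. It is not taken from the talk, from [1], or from [2]. The class name `LinUCB`, the parameters `alpha` and `reg`, the `simulate` helper, the three feature vectors, and the hidden parameter `true_theta` are all assumptions made for illustration. Rewards are observed immediately here, so the delayed-feedback setting of [2] is not modeled.

```python
import math
import random


class LinUCB:
    """Minimal LinUCB-style sketch for a linear contextual bandit (illustrative only)."""

    def __init__(self, dim, alpha=1.0, reg=1.0):
        self.dim = dim
        self.alpha = alpha  # exploration width (hypothetical tuning parameter)
        # Inverse of the regularized design matrix V = reg * I + sum of x x^T.
        self.v_inv = [[(1.0 / reg if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * feature vector

    def _matvec(self, x):
        return [sum(row[j] * x[j] for j in range(self.dim)) for row in self.v_inv]

    def theta(self):
        # Ridge-regression estimate: theta_hat = V^{-1} b.
        return self._matvec(self.b)

    def select(self, arm_features):
        # Pick the arm with the largest optimistic score: estimate + bonus.
        theta = self.theta()
        best, best_score = 0, -math.inf
        for k, x in enumerate(arm_features):
            vx = self._matvec(x)
            mean = sum(t * xi for t, xi in zip(theta, x))
            width = math.sqrt(max(sum(xi * vi for xi, vi in zip(x, vx)), 0.0))
            score = mean + self.alpha * width
            if score > best_score:
                best, best_score = k, score
        return best

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of V^{-1} after observing (x, reward).
        vx = self._matvec(x)
        denom = 1.0 + sum(xi * vi for xi, vi in zip(x, vx))
        for i in range(self.dim):
            for j in range(self.dim):
                self.v_inv[i][j] -= vx[i] * vx[j] / denom
        for i in range(self.dim):
            self.b[i] += reward * x[i]


def simulate(rounds=2000, seed=0):
    rng = random.Random(seed)
    true_theta = [1.0, -0.5]  # hypothetical hidden parameter
    # Fixed feature vectors for simplicity; in general they change with the context.
    arms = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
    agent = LinUCB(dim=2, alpha=1.0)
    pulls = [0] * len(arms)
    for _ in range(rounds):
        k = agent.select(arms)
        noise = rng.gauss(0.0, 0.1)
        reward = sum(t * xi for t, xi in zip(true_theta, arms[k])) + noise
        agent.update(arms[k], reward)
        pulls[k] += 1
    return agent, pulls


if __name__ == "__main__":
    agent, pulls = simulate()
    print("pull counts per arm:", pulls)
    print("estimated theta:", agent.theta())
```

In this toy setup the first arm has the highest expected reward, so the pull counts should concentrate on it after a short exploration phase. Handling delayed or missing feedback would require changing when and how `update` is called, which this sketch does not attempt.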
Language: English; Date: 05.04.2019; Speaker: Vernade, Claire; Event: Workshop 3 - CEB T1 2019; Venue: IHP

Comments: 2
@pk_1320, 2 years ago
great presentation!!
@irshviralvideo, 3 years ago
LINK TO SLIDES PLZ !