[Ml-stat-talks] Emilie Kaufmann's talk

Sebastien Bubeck sbubeck at Princeton.EDU
Tue May 28 15:37:40 EDT 2013


=== Wilks Statistics Seminar ===

DATE:   Wednesday, May 29th

TIME:   12:30pm

LOCATION:   Sherrerd Hall 101

SPEAKER:   Emilie Kaufmann, Telecom ParisTech

TITLE:   Bayesian and frequentist methods in bandit models

ABSTRACT:  A stochastic (multi-armed) bandit model is a simple setup in which an agent
interacts with a set of K unknown probability distributions (or 'arms'). One can think
of K slot machines, or 'one-armed bandits'. When the agent draws an arm, he observes a
sample from that arm's distribution. This sample is called a 'reward' when the agent's
goal is to maximize the sum of rewards accumulated while sequentially drawing arms.
This 'regret minimization' objective makes sense in many applications, starting with
the medical allocation problems that motivated the early study of bandit models. An
alternative goal, called 'pure exploration', is to identify the m best arms as quickly
as possible, without suffering a loss when drawing bad arms. In this talk, we present
improved algorithms for both problems in the parametric case, where each distribution
is characterized by a fixed, unknown parameter. Some improvements rely on refined
confidence intervals based on the Kullback-Leibler divergence; others exploit a
posterior distribution on the arms, that is, they use Bayesian tools to solve a
frequentist problem.
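As a concrete illustration of the Bayesian approach the abstract alludes to, here is a
minimal sketch of Thompson sampling for Bernoulli arms: maintain a Beta posterior per
arm, sample one value from each posterior, and pull the arm with the highest sample.
This is a generic textbook sketch, not the speaker's algorithm; the function name and
parameters are illustrative.

```python
import random

def thompson_sampling(true_means, horizon, seed=0):
    """Thompson sampling on Bernoulli arms (illustrative sketch).

    true_means: success probability of each arm (unknown to the agent).
    horizon:    number of rounds to play.
    Returns (pulls per arm, total accumulated reward).
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k  # observed 1-rewards per arm (Beta alpha - 1)
    failures = [0] * k   # observed 0-rewards per arm (Beta beta - 1)
    pulls = [0] * k
    total_reward = 0
    for _ in range(horizon):
        # Draw one sample from each arm's Beta(1 + s, 1 + f) posterior
        # (uniform Beta(1, 1) prior), then play the arm whose sampled
        # mean is largest.
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward
    return pulls, total_reward
```

For example, on three arms with means [0.2, 0.5, 0.8] and a horizon of 2000 rounds,
the posterior on the best arm concentrates quickly and that arm receives the large
majority of the pulls, even though the agent is never told the true means.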