[Ml-stat-talks] Yuval Peres seminars (ORFE/Math) - April 29 and 30
Ramon van Handel
rvan at Princeton.EDU
Mon Apr 21 15:54:50 EDT 2014
Dear all,
Yuval Peres (MSR) will be visiting next week and will be giving two
seminars, the ORFE colloquium on April 29 and a special probability
seminar in Math on April 30. These are not to be missed by anyone
interested in probability, machine learning, TCS, ... Details below!
Best, -- Ramon
(I) ORFE COLLOQUIUM
Tuesday, April 29, 4:30 PM, Sherrerd Hall 101
Speaker: Yuval Peres (MSR)
Title: Bandits with Switching Costs: T^{2/3} Regret
Abstract:
Consider a player selecting, repeatedly, one of two possible actions. In
each round the player learns the loss arising from his action, and his
regret after T rounds is his cumulative loss, minus the loss incurred by
consistently selecting the better of the two actions. It is well known
that in this setting (known as the adversarial two-armed bandit problem)
several algorithms (including a variant of multiplicative weights) can
ensure the player a regret of at most O(T^{1/2}) and this is sharp.
However, if the player incurs a unit cost each time he switches actions
then the best known upper bound on regret was O(T^{2/3}) while the best
lower bound known was still of order T^{1/2}. We resolve this gap, by
proving the upper bound is sharp (up to a log term). In the corresponding
full-information problem, the minimax regret is known to grow at a slower
rate of T^{1/2}. The difference between these two rates indicates that
learning with bandit feedback (i.e. just knowing the loss from the
player's action, not the alternative) can be significantly harder than
learning with full-information feedback. It also shows that without
switching costs, any regret-minimizing algorithm for the bandit problem
must sometimes switch actions very frequently. The proof is based on an
information-theoretic analysis of a loss process arising from a
multi-scale random walk. (Joint work with Ofer Dekel, Jian Ding and Tomer
Koren, to appear in STOC 2014 available at http://arxiv.org/abs/1310.2997)
(II) SPECIAL PROBABILITY SEMINAR
Wednesday, April 30, 2:30 PM, Fine Hall 314
Speaker: Yuval Peres (MSR)
Title: Cover times, blanket times, and the Gaussian free field
Abstract:
The cover time of a finite graph G (the expected time for simple random
walk to visit all vertices) has been extensively studied, yet a number of
fundamental questions concerning cover times have remained open: Aldous
and Fill (1994) asked whether there is a deterministic polynomial-time
algorithm that computes the cover time up to an O(1) factor; Winkler and
Zuckerman (1996) defined the blanket time (when the empirical distribution
of the walk is within a factor of 2, say, of the stationary distribution)
and conjectured that the blanket time is always within O(1) of the cover
time. The best approximation factor found so far for both these problems
was (log log n)^2 for n-vertex graphs, due to Kahn, Kim, Lovasz, and Vu
(2000). We show that the cover time of G, normalized by the number of
edges, is equivalent (up to a universal constant) to the square of the
expected maximum of the Gaussian free field on G. We use this connection
and Talagrand's majorizing measure theory to deduce a positive answer to
the question of Aldous-Fill (1994) and establish the conjecture of
Winkler-Zuckerman (1996). All these results extend to arbitrary reversible
finite Markov chains. Joint work with Jian Ding (U.C. Berkeley) and James
Lee (University of Washington).
More information about the Ml-stat-talks
mailing list