[Ml-stat-talks] Yuval Peres seminars (ORFE/Math) - April 29 and 30

Ramon van Handel rvan at Princeton.EDU
Mon Apr 21 15:54:50 EDT 2014

Dear all,
Yuval Peres (MSR) will be visiting next week and will be giving two 
seminars, the ORFE colloquium on April 29 and a special probability 
seminar in Math on April 30.  These are not to be missed by anyone 
interested in probability, machine learning, TCS, ...  Details below!
Best, -- Ramon


Tuesday, April 29, 4:30 PM, Sherrerd Hall 101

Speaker:  Yuval Peres (MSR)

Title:  Bandits with Switching Costs: T^{2/3} Regret


Consider a player selecting, repeatedly, one of two possible actions. In 
each round the player learns the loss arising from his action, and his 
regret after T rounds is his cumulative loss, minus the loss incurred by 
consistently selecting the better of the two actions. It is well known 
that in this setting (known as the adversarial two-armed bandit problem) 
several algorithms (including a variant of multiplicative weights) can 
ensure the player a regret of at most O(T^{1/2}) and this is sharp. 
However, if the player incurs a unit cost each time he switches actions 
then the best known upper bound on regret was O(T^{2/3}) while the best 
lower bound known was still of order T^{1/2}. We resolve this gap, by 
proving the upper bound is sharp (up to a log term). In the corresponding 
full-information problem, the minimax regret is known to grow at a slower 
rate of T^{1/2}. The difference between these two rates indicates that 
learning with bandit feedback (i.e. just knowing the loss from the 
player's action, not the alternative) can be significantly harder than 
learning with full-information feedback. It also shows that without 
switching costs, any regret-minimizing algorithm for the bandit problem 
must sometimes switch actions very frequently. The proof is based on an 
information-theoretic analysis of a loss process arising from a 
multi-scale random walk. (Joint work with Ofer Dekel, Jian Ding and Tomer 
Koren, to appear in STOC 2014 available at http://arxiv.org/abs/1310.2997)


Wednesday, April 30, 2:30 PM, Fine Hall 314

Speaker:  Yuval Peres (MSR)

Title:  Cover times, blanket times, and the Gaussian free field


The cover time of a finite graph G (the expected time for simple random 
walk to visit all vertices) has been extensively studied, yet a number of 
fundamental questions concerning cover times have remained open: Aldous 
and Fill (1994) asked whether there is a deterministic polynomial-time 
algorithm that computes the cover time up to an O(1) factor; Winkler and 
Zuckerman (1996) defined the blanket time (when the empirical distribution 
of the walk is within a factor of 2, say, of the stationary distribution) 
and conjectured that the blanket time is always within O(1) of the cover 
time. The best approximation factor found so far for both these problems 
was (log log n)^2 for n-vertex graphs, due to Kahn, Kim, Lovasz, and Vu 
(2000).  We show that the cover time of G, normalized by the number of 
edges, is equivalent (up to a universal constant) to the square of the 
expected maximum of the Gaussian free field on G. We use this connection 
and Talagrand's majorizing measure theory to deduce a positive answer to 
the question of Aldous-Fill (1994) and establish the conjecture of 
Winkler-Zuckerman (1996). All these results extend to arbitrary reversible 
finite Markov chains.  Joint work with Jian Ding (U.C. Berkeley) and James 
Lee (University of Washington).

More information about the Ml-stat-talks mailing list