[Ml-stat-talks] Akshay Krishnamurthy talk on June 1st

Barbara Engelhardt bee at princeton.edu
Sat May 28 20:23:00 EDT 2016


Talk of interest, hosted by Han Liu:

Title: Contextual-MDPs: A new model and analysis for reinforcement learning
with rich observations

Abstract:

State-of-the-art reinforcement learning systems cannot solve hard problems
because they lack systematic exploration mechanisms, instead relying on
variants of uniform exploration to collect diverse experience. Systematic
exploration mechanisms exist for Markov Decision Processes (MDPs) but have
sample complexity that scales polynomially with the number of unique
observations, making them intractable for modern reinforcement learning
applications where observations come from a visual sensor. Are there
reinforcement learning algorithms that can effectively handle rich
(high-dimensional, infinite) observations by engaging in systematic
exploration? To aid in the development of such an algorithm, I will first
describe a new model for reinforcement learning with rich observations that
we call the Contextual-MDP. This model generalizes both stochastic
contextual bandits and MDPs, but is considerably more tractable than
Partially Observable Markov Decision Processes (POMDPs). I will then
describe a new algorithm for learning optimal behavior in Contextual-MDPs.
This algorithm engages in global exploration while using a function class
to approximate future performance and has a sample complexity that scales
polynomially in all relevant parameters while being independent of the
number of unique observations. This represents an exponential improvement
on all existing approaches and provides theoretical justification for
reinforcement learning with function approximation.
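For readers unfamiliar with the setup, here is a minimal, purely illustrative sketch
(in Python; the class and parameter names are hypothetical, not from the talk) of the
interaction loop the abstract describes: a small number of hidden states drive the
dynamics and rewards, but the learner only ever sees rich, high-dimensional
observations generated from the current hidden state, so the number of unique
observations is effectively unbounded even though the underlying state space is small.

import numpy as np

# Illustrative sketch of a Contextual-MDP-style environment (hypothetical names,
# not code from the talk): a small hidden-state MDP emits rich observations, and
# a learner's policy must map observations (never the hidden state) to actions.

class ToyContextualMDP:
    def __init__(self, n_states=3, n_actions=2, obs_dim=100, horizon=5, seed=0):
        self.rng = np.random.default_rng(seed)
        self.n_states, self.n_actions, self.horizon = n_states, n_actions, horizon
        # Hidden-state transitions: P[s, a] is a distribution over next states.
        self.P = self.rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
        # Expected reward for each (state, action) pair.
        self.R = self.rng.uniform(size=(n_states, n_actions))
        # Each hidden state has its own high-dimensional observation distribution.
        self.obs_means = self.rng.normal(size=(n_states, obs_dim))

    def reset(self):
        self.state, self.t = 0, 0
        return self._observe()

    def _observe(self):
        # Rich (effectively unique) observation generated from the hidden state.
        noise = self.rng.normal(scale=0.1, size=self.obs_means.shape[1])
        return self.obs_means[self.state] + noise

    def step(self, action):
        reward = self.R[self.state, action]
        self.state = self.rng.choice(self.n_states, p=self.P[self.state, action])
        self.t += 1
        return self._observe(), reward, self.t >= self.horizon


env = ToyContextualMDP()
obs, done, total = env.reset(), False, 0.0
while not done:
    # A real algorithm would pick actions via a learned function of `obs`;
    # uniform random exploration is shown only as the naive baseline the
    # abstract critiques.
    action = int(np.random.randint(env.n_actions))
    obs, reward, done = env.step(action)
    total += reward
print("episode return:", total)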

Bio

Akshay Krishnamurthy is a postdoctoral researcher at Microsoft Research,
New York City. In June 2015, he completed his PhD in the Computer Science
Department at Carnegie Mellon University, advised by Aarti Singh. His
research focuses on interactive learning and learning settings involving
feedback-driven data collection, including reinforcement learning,
interactive solutions for discovering low-dimensional structure, and
complex prediction problems with limited feedback. He has also worked on
problems in nonparametric statistics, anomaly detection, and compressive
sensing. Starting Fall 2016, he will join the Department of Computer
Science at the University of Massachusetts, Amherst as an assistant
professor.