Cyril Zhang will present his Pre-FPO talk, "Optimization and Regret Minimization in Stateful Environments," on May 19, 2020 at 2pm.

Zoom link: 
https://princeton.zoom.us/j/5554928263?pwd=Zys1NUY5RldIWjZVZVhJYVJCVHcydz09

The members of his committee are as follows:
Advisor: Elad Hazan
Examiners: Sanjeev Arora, Mark Braverman
Readers: Yoram Singer, Karthik Narasimhan

Abstract follows below. 

Temporally correlated data and non-convex programs are two core challenges at the frontier of reconciling theory and practice in machine learning. I will present varied perspectives on algorithms and provable learning guarantees in stateful environments, such as those encountered in reinforcement learning, control theory, and time-series analysis. Unified by the framework of online convex optimization, this thesis will consist of two main sections:

- Learning dynamical systems: a line of theoretical work on learning to predict, control, and plan in the presence of time-invariant state transitions, beyond the linear-quadratic-Gaussian assumptions of classical control theory. The regret framework yields robust algorithms that bypass the usual distributional assumptions and system-identification pipelines.

- Optimization in deep learning: a line of empirical work on understanding the behavior of adaptive gradient methods in state-of-the-art deep learning. These poorly understood heuristics were born in the online convex world; I will discuss some of the intuitions we can carry over, as well as some of the challenges and shortcomings.

Finally, I will tie these threads together with some recent work applying these ideas to my favorite non-convex time-series problem: statistical language modeling with deep neural networks.