ECE SEMINAR

Speaker: Dylan Foster, Microsoft Research
Title: Understanding the foundation model pipeline: From imitation to exploration
Day: Monday, March 23, 2026
Time: 4:30 PM
Location: B205 Engineering Quadrangle
Host: Guillermo Sapiro

Abstract:
The prevailing recipe of ever-larger models trained on passively collected data is showing diminishing returns and facing growing constraints on data. To sustain progress, the next phase of AI will hinge on experience: agents and systems that actively curate their own data through interaction. This talk asks: which principles behind today's foundation model pipeline will carry us there, and where do we need new ones?

I. Rethinking the mechanisms behind next-token prediction. Pre-training with next-token prediction is the basis for language modeling, but will it hit a wall as we scale to longer horizons? We will argue that the answer is no, but for reasons that overturn conventional wisdom and point to new interventions.
II. Quantifying what makes a good foundation model. Post-training uses interaction to surface capabilities from the pre-trained model, but when does this amplify existing behaviors versus discover new ones? We introduce the coverage profile, a simple property of the model that characterizes post-training success and failure, and that standard pre-training secretly optimizes for.
III. Revisiting reinforcement learning for the foundation model era. Classical RL offers powerful mechanisms for going beyond what the model already knows, but leveraging them in the foundation model era requires new ideas. We will show that outcome-based methods are limited by coverage, and that overcoming this requires rethinking how models explore.

Bio:
Dylan Foster is a principal researcher at Microsoft Research, New England and New York City. Previously, he was a postdoctoral fellow at MIT IDSS with Sasha Rakhlin, and received his PhD from the Department of Computer Science at Cornell University, advised by Karthik Sridharan. His research develops principles and algorithms for learning from interaction, including reinforcement learning, foundation model training, and interactive decision making. His work has received several awards, including best paper at COLT (2019), best student paper at COLT (2018, 2019), and the Cornell CS doctoral dissertation award.