Kurtland Chua will present his FPO, "Analyzing Meta-Learning in Supervised and Reinforcement Learning Settings," on Wednesday, April 30, 2025 at 2:00 PM in CS 401.

Location: CS 401

The members of Kurtland’s committee are as follows:
Examiners: Jason Lee (Adviser), Elad Hazan, Benjamin Eysenbach
Readers: Ryan Adams, Qi Lei (NYU)

A copy of his thesis is available upon request; please email gradinfo@cs.princeton.edu if you would like one.
 
Everyone is invited to attend his talk. 
 
Abstract follows below:
Machine learning has recently seen widespread use of large foundation models, which, although highly expressive, are prone to overfitting when trained on limited domain-specific datasets. As such, training on larger but indirectly relevant datasets to extract useful representations, and more generally meta-learning, has proven key to using these models in data-sparse settings. This dissertation presents a theoretical study of the assumptions underlying meta-learning, focusing on two settings:

Meta-Supervised Learning: Prior work has analyzed the statistical complexity of learning a fixed representation shared across several regression tasks. However, methods used in practice, including popular gradient-based approaches such as MAML, fine-tune the representation for each task. We therefore present the first theoretical study of fine-tuning-based representation learning. First, we derive sample complexity bounds for a representative procedure applied to a generic class of representations. Second, we establish a sample complexity separation between fine-tunable and fixed representations, illustrating when fine-tuning is preferable.

Meta-Reinforcement Learning (RL): Many notions of shared task structure exist for meta-RL, including broadly useful “options” as in hierarchical RL (HRL). However, prior HRL regret bounds assume that the hierarchical structure is known a priori. To bridge this gap, we first construct a notion of hierarchical structure that is provably recoverable under appropriate “coverage conditions”. We then show that the recovered hierarchy can be used to achieve regret on downstream tasks that is exponentially better than the minimax rate, under mild assumptions. These conditions encode notions such as temporal and state/action abstractions, suggesting that our analysis captures important aspects of HRL in practice.