
Kathryn Wantlin will present her General Exam, "Self-Supervised Multi-Task Exploration and RL," on Friday, May 2, 2025 at 3:00 PM in CS 401.

Committee Members: Ryan Adams (advisor), Ben Eysenbach, Olga Russakovsky

Abstract:
Learning from demonstrations to imitate a single behavior in a given environment can require hundreds or thousands of expert (human) demonstrations, and learning a distribution of tasks in an environment is even more prohibitively inefficient. This work presents a framework for achieving zero-shot generalization in a given environment without access to expert data. We replace the imitation learning training phase with a self-supervised exploration phase that trains a goal-conditioned critic, a goal-conditioned policy, and a variational posterior for inferring goals. At test time, we zero-shot infer a goal from a demonstration and then unroll the associated goal-conditioned policy to imitate the demonstration's behavior with low regret. We make the key assumption that many behaviors are goal-reaching, an inductive bias under which inferring an agent's goal offers a better bias-variance tradeoff than inferring their reward function. During the self-supervised exploration phase, we learn temporally contrastive features and employ several methods for automatic goal selection, comparing the results against oracle goal sampling. We compare our pipeline against related methods for zero-shot imitation, showing competitive performance against goal-conditioned behavioral cloning across our benchmarked environments.

Reading List: https://docs.google.com/document/d/1zhhjPl5Ei1f_awCVUBwduVqi6F1AHZ6iMiJdsiTe...

Everyone is invited to attend the talk, and those faculty wishing to remain for the oral exam following are welcome to do so.
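
For attendees who want a concrete picture of the test-time procedure described in the abstract (infer a goal from a demonstration, then unroll the goal-conditioned policy), here is a minimal illustrative sketch. It is not the presenter's implementation; the encoder, posterior, policy, and environment interfaces are hypothetical placeholders.

    def infer_goal(demonstration, encoder, posterior):
        # Embed each observation of the demonstration with the (hypothetical)
        # temporally contrastive encoder, then ask the variational posterior
        # for the most likely goal given those features.
        features = [encoder(obs) for obs in demonstration]
        return posterior.most_likely_goal(features)

    def zero_shot_imitate(env, demonstration, encoder, posterior, policy, max_steps=500):
        # Zero-shot imitation: no expert actions are used, only the
        # demonstration's observations are needed to infer the goal.
        goal = infer_goal(demonstration, encoder, posterior)
        obs = env.reset()
        for _ in range(max_steps):
            action = policy(obs, goal)  # goal-conditioned policy rollout
            obs, reward, done, info = env.step(action)
            if done:
                break
        return obs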