Zoe Ashwood will present her FPO, "Probabilistic Models for Characterizing Animal Learning and Decision-Making," on Monday, July 25, 2022, at 9:00 AM in PNI 101 and via Zoom.
The members of Zoe’s committee are as follows:
Examiners: Jonathan Pillow (Adviser), Ryan Adams, Tom Griffiths
Readers: Barbara Engelhardt, Alexandre Pouget (University of Geneva)
A copy of her thesis is available upon request; please email gradinfo@cs.princeton.edu if you would like one.
Everyone is invited to attend her talk.
The abstract follows:
A central goal in neuroscience is to identify the strategies used by animals and humans as they make decisions, and to characterize the learning rules that shape these policies in the first place. In this thesis, I discuss three projects aimed at tackling this goal. In the first, we introduce a novel framework, the GLM-HMM, for characterizing mouse and human choice policies during binary decision-making tasks. The GLM-HMM is a hidden Markov model (HMM) with Bernoulli generalized linear model (GLM) observations. By fitting the GLM-HMM to hundreds of thousands of decisions, our framework revealed that, contrary to common wisdom, mice and humans use multiple strategies over the course of a single session to perform perceptual decision-making tasks. In the second project, we sought to uncover the learning rules used by mice and rats as they learned to perform this type of task. Our model tracked trial-to-trial changes in the animals’ choice policies and separated these changes into components explainable by a reinforcement learning rule and components that remained unexplained. Under our model, the conventional REINFORCE learning rule accounted for, on average, only 30% of the policy updates of mice learning a common task, whereas adding baseline parameters allowed the learning rule to explain 92% of the animals’ policy updates. Finally, I discuss our approach to applying inverse reinforcement learning (IRL) to the trajectories of mice exploring a maze environment. While IRL has been widely applied in robotics and healthcare settings to infer the unknown reward function of an agent, it has yet to be applied extensively in neuroscience. One potential reason for this is that existing IRL frameworks assume that an agent’s reward function is fixed over time. To overcome this limitation, we introduce ‘DIRL’, an IRL framework that allows for time-varying intrinsic rewards. Our method returns interpretable reward functions for two separate cohorts of mice and provides a novel characterization of exploratory behavior. Overall, we anticipate that DIRL will have broad applicability in neuroscience, and that it could also facilitate the design of biologically inspired reward functions for training artificial agents to perform analogous tasks.
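
For readers unfamiliar with the model class, below is a minimal, purely illustrative sketch of a GLM-HMM likelihood computation, not the thesis code: each latent state carries its own Bernoulli-GLM weights mapping per-trial regressors to a choice probability, and a forward pass over the HMM gives the log-likelihood of a choice sequence. All names (glm_hmm_loglik, the toy regressors and parameters) are hypothetical, and only NumPy is assumed.

```python
# Illustrative GLM-HMM sketch (hypothetical names; not the thesis implementation).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def glm_hmm_loglik(choices, inputs, weights, trans, init):
    """Log-likelihood of a binary choice sequence under a GLM-HMM.

    choices : (T,) array of 0/1 choices
    inputs  : (T, D) per-trial regressors (e.g. stimulus, bias)
    weights : (K, D) per-state Bernoulli-GLM weights
    trans   : (K, K) state transition matrix, rows sum to 1
    init    : (K,) initial state distribution
    """
    # Per-state, per-trial probability of the observed choice
    p_right = sigmoid(inputs @ weights.T)                            # (T, K)
    lik = np.where(choices[:, None] == 1, p_right, 1.0 - p_right)    # (T, K)

    # Forward algorithm with per-step normalization for numerical stability
    alpha = init * lik[0]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for t in range(1, len(choices)):
        alpha = (alpha @ trans) * lik[t]
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik

# Toy usage: 3 latent strategies, stimulus + bias regressors, "sticky" states
rng = np.random.default_rng(0)
T, D, K = 500, 2, 3
inputs = np.column_stack([rng.normal(size=T), np.ones(T)])
weights = np.array([[5.0, 0.0],     # "engaged": strong stimulus weight
                    [0.5, 2.0],     # right-biased state
                    [0.5, -2.0]])   # left-biased state
trans = np.full((K, K), 0.01) + np.eye(K) * 0.97
init = np.ones(K) / K
choices = rng.integers(0, 2, size=T)  # placeholder choices for the demo
print(glm_hmm_loglik(choices, inputs, weights, trans, init))
```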
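
The REINFORCE-with-baseline rule mentioned in the second project can be written, for a single-trial logistic choice policy, as w <- w + lr * (reward - baseline) * grad_w log pi(choice | x; w). The sketch below renders that textbook update under the same assumptions (NumPy only, hypothetical names); it is not the fitting procedure developed in the thesis.

```python
# Illustrative REINFORCE-with-baseline update for a Bernoulli-logistic policy.
import numpy as np

def reinforce_baseline_update(w, x, choice, reward, baseline, lr=0.1):
    """One policy-gradient step: w <- w + lr * (reward - baseline) * grad log pi.

    w        : (D,) policy weights
    x        : (D,) trial regressors
    choice   : 0 or 1, the action taken on this trial
    reward   : scalar reward received
    baseline : scalar baseline subtracted from the reward
    """
    p_right = 1.0 / (1.0 + np.exp(-(w @ x)))
    # Gradient of log pi(choice | x; w) for a Bernoulli-logistic policy
    grad_logpi = (choice - p_right) * x
    return w + lr * (reward - baseline) * grad_logpi
```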