Max Gupta will present his MSE talk "Inductive Biases for Relational Reasoning in Neural Networks" on Thursday 4/23/26 at 9:30am in Friend 004.
Advisor: Tom Griffiths
Reader: Brenden Lake
Title: Inductive Biases for Relational Reasoning in Neural Networks
Abstract: This thesis examines how the inductive biases of neural networks (architecture, training data, training objective, and weight initialization) affect the emergence of visual relational reasoning capabilities. I present both prior published work and ongoing experimental work under review. By examining the effect of each of these four inductive biases, the thesis aims to disentangle the relative influence of each bias on the relational reasoning abilities of vision models. I cover a series of increasingly large vision models across a suite of increasingly abstract relational reasoning tasks. Along the task complexity axis, I start with the most fundamental relation: same-different reasoning, that is, classifying two objects as 'same' or 'different'. Scaling up task complexity along this axis to higher-order relational tasks involves composing intermediate same-different judgements, as in odd-one-out and relational match-to-sample, tasks often used in cognitive science and visual IQ tests. I compare model behavior with human data to understand certain perceptual biases towards regularity and simplicity in humans versus neural networks. Throughout, I focus on meta-learning (learning to learn) as a particularly effective training method for endowing neural networks with relational reasoning capabilities. In particular, the episodic partitioning of training data into mini train/test sets enables strong, generalizable relational reasoning even in tiny convolutional networks, with a fraction of the data used to train pre-trained vision models. I also employ interpretability techniques to understand what is going on inside models meta-trained from scratch and inside off-the-shelf pre-trained models.
One compelling finding is the recurrence of a two-stage perceptual-to-relational reasoning pipeline that vision models across architectures and scales employ to reason relationally. Moreover, we find that the concepts of 'same' and 'different' are more strongly represented as orthogonal dimensions in smaller, meta-trained models than in larger off-the-shelf models like CLIP and DINOv2. In these larger models, I also find correlational evidence that similar attention heads are employed for relational reasoning across this suite of tasks. I highlight this evidence as a promising avenue for future work investigating causal evidence for a shared subspace encoding first- and second-order relations. Initial results show that such causal evidence is easiest to obtain in smaller models and for simpler relations like same-different, which live in relatively low-dimensional subspaces. I conclude by addressing the possibility of intervening in these subspaces and the downstream consequences for model controllability.