Talks on interpretability in multi-modal foundation models: Wednesday, 12/10 at 3:30pm
Guillermo Sapiro and Sanketh Vedula will host guests at Princeton Precision Health's office at 252 Nassau St, 2nd Floor, from 3:30pm to 5:00pm on Wednesday, December 10, for a session titled "On decoding the inner workings of multimodal foundation models."

Speakers: Alberto Cazzaniga, Lorenzo Basile, and Diego Doimo, all researchers at Area Science Park, Trieste, Italy.

Alberto Cazzaniga will present "On image-text communication in vision language models":
Vision-language models (VLMs) integrate images and text efficiently, but how they transmit visual information into text generation remains poorly understood. We present two mechanistic findings that clarify image-text communication in modern VLMs. First, using counterfactual multimodal queries, we isolate a small set of attention heads that decide whether the model follows the image or its internal knowledge; editing these heads reliably shifts behavior and reveals the image regions that drive it. Second, comparing native and non-native VLMs, we show that they rely on distinct pathways for visual-to-text transfer: non-native models distribute information across many image tokens, whereas native models depend on a single gate-like token whose removal severely degrades image understanding. Together, these insights offer a clearer and more actionable view of how VLMs process visual evidence.

Lorenzo Basile will present "Head Specialization in Vision, Language, and Multimodal Transformers":
Transformers often appear as opaque systems, but recent research shows that they contain meaningful internal structure. A key example is head specialization, where individual attention heads consistently encode specific concepts across language, vision, and multimodal models. Some heads capture visual properties like shape or color, while others represent linguistic or numerical information such as sentiment or toxic words. Identifying these specialized heads not only deepens interpretability but also provides practical tools for controlling model behavior. By selectively amplifying or suppressing head activity, we can adjust concept representations, adapt models to new tasks, and improve performance with minimal parameter changes. This talk presents methods for discovering head specialization, demonstrates its emergence across diverse architectures, and shows how these structures can be leveraged for precise, parameter-efficient interventions.

Diego Doimo will present "The geometry of hidden representations of large transformer models":
In this talk, we will show how the geometric properties of hidden representations can help us understand the semantic information encoded by transformers. In the first part, we will focus on the intrinsic dimension of the internal transformer representations, showing that it is a valuable tool for identifying the layers encoding the semantic content of data across different domains, such as images, biological sequences, and text. In the second part, we will analyze the probability density of hidden representations. Specifically, we focus on how language models solve a question-answering task with few-shot learning and fine-tuning. We show that while both approaches can achieve similar performance, they create very different density distributions in the hidden representations, changing with a sharp geometrical transition in the middle of the network.
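To make the intrinsic-dimension idea in the last abstract concrete, below is a minimal Python sketch of one widely used estimator, TwoNN, applied to a synthetic point cloud standing in for the hidden states of a single transformer layer. This is illustrative only, not the speakers' code: the choice of estimator, the function name, and the example data are assumptions, and their actual tooling may differ.

# Minimal sketch (assumption, not the speakers' implementation) of the TwoNN
# intrinsic-dimension estimator: for each point, the ratio of the distances to its
# second and first nearest neighbors follows a Pareto law whose exponent is the
# intrinsic dimension, giving the maximum-likelihood estimate
# d = N / sum_i log(r2_i / r1_i).
import numpy as np
from sklearn.neighbors import NearestNeighbors


def twonn_intrinsic_dimension(X: np.ndarray) -> float:
    """Estimate the intrinsic dimension of a point cloud X of shape (N, D)."""
    # Distances to the two nearest neighbors of each point (column 0 is the point itself).
    nn = NearestNeighbors(n_neighbors=3).fit(X)
    dists, _ = nn.kneighbors(X)
    r1, r2 = dists[:, 1], dists[:, 2]

    # Drop duplicate points, whose first-neighbor distance is zero.
    mask = r1 > 0
    mu = r2[mask] / r1[mask]

    # Maximum-likelihood estimate of the Pareto exponent, i.e. the intrinsic dimension.
    return mask.sum() / np.sum(np.log(mu))


if __name__ == "__main__":
    # Hidden representations would normally come from a transformer layer; here we use a
    # synthetic 5-dimensional linear manifold embedded in 50 ambient dimensions.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(2000, 5))
    X = latent @ rng.normal(size=(5, 50))
    print(f"Estimated intrinsic dimension: {twonn_intrinsic_dimension(X):.2f}")

On real activations, the same function would take the matrix of per-token or per-example hidden states extracted from a given layer; the abstract's point is that tracking this quantity across layers helps locate where the semantic content of the data is encoded.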
For more information, please contact Sanketh Vedula at svedula@princeton.edu.

Getting to the seminar space currently requires climbing a set of stairs. If an accommodation is needed, please contact PPH in advance at PrincetonPPH@princeton.edu.

Thank you,
Princeton Precision Health