Berlin Chen will present his General Exam "Inference Efficiency for Sub-quadratic Models" on Friday, January 30, 2026 at 11:00 AM in Friend 108 and via Zoom.

Zoom link: https://princeton.zoom.us/j/92265989878

Committee Members: Tri Dao (advisor), Elad Hazan, Kai Li

Abstract: Recent progress in AI has witnessed a paradigm shift toward test-time compute, where LLM inference commands an increasingly large share of the compute budget. This shift presents a new opportunity for model design tailored to the computational characteristics of inference. In this talk, I will present recent work on improving the decoding efficiency of Mamba, a subquadratic model based on State Space Models (SSMs). In particular, I will highlight a key challenge preventing subquadratic models from being hardware-efficient during decoding. I will then propose an adjustment to the model that addresses this challenge and demonstrate its key advantages from an inference-first perspective. I will further motivate the change by connecting it to classic SSMs. Drawing on this connection, I will highlight two additional architectural adjustments to Mamba that are naturally motivated by classic SSM formulations and demonstrate their roles in improving model quality.

Reading List: https://docs.google.com/document/d/1T2z9TI_StxTt9tMNawjSo5842qr8JSBP2zgl7oS3...

Everyone is invited to attend the talk, and those faculty wishing to remain for the oral exam that follows are welcome to do so.
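For readers unfamiliar with why SSM-based models such as Mamba are attractive for decoding, the following is a minimal illustrative sketch (not from the talk, and not Mamba's actual parameterization): each decode step updates a fixed-size recurrent state, so per-token cost is constant in sequence length, unlike attention, whose per-token cost grows with the cached context.

```python
# Toy linear SSM decode loop: h' = A @ h + B * x_t, y_t = C @ h'.
# The state h has a fixed size n, independent of how many tokens
# have been processed -- this is the source of subquadratic decoding.
import numpy as np

def ssm_decode_step(h, x_t, A, B, C):
    """One recurrent decode step with scalar input x_t."""
    h = A @ h + B * x_t   # fixed-size state update
    y_t = C @ h           # readout
    return h, y_t

n = 4                                 # state size (hypothetical, for illustration)
rng = np.random.default_rng(0)
A = 0.9 * np.eye(n)                   # toy stable state matrix
B = rng.standard_normal(n)
C = rng.standard_normal(n)

h = np.zeros(n)
for x_t in [1.0, 0.5, -0.2]:          # stream tokens one at a time
    h, y_t = ssm_decode_step(h, x_t, A, B, C)
# h is still size n after any number of tokens; no growing KV cache.
```

This is only a schematic of the recurrence; real selective SSMs make A, B, and C input-dependent, which is where the hardware-efficiency questions discussed in the talk arise.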
The time and location for this General Exam have been updated to 10:00 AM in CS 301. Sorry for any confusion.

Berlin Chen will present his General Exam "Inference Efficiency for Sub-quadratic Models" on Friday, January 30, 2026 at 10:00 AM in CS 301 and via Zoom.

Zoom link: https://princeton.zoom.us/j/92265989878