Berlin Chen will present his General Exam, "Inference Efficiency for Sub-quadratic Models," on Friday, January 30, 2026, at 11:00 AM in Friend 108 and via Zoom.

Zoom link: https://princeton.zoom.us/j/92265989878

Committee Members: Tri Dao (advisor), Elad Hazan, Kai Li

Abstract:
Recent progress in AI has seen a paradigm shift toward test-time compute, in which LLM inference commands an increasingly large share of the compute budget. This shift presents a new opportunity for model designs tailored to the computational characteristics of inference. In this talk, I will present recent work on improving the decoding efficiency of Mamba, a sub-quadratic model based on State Space Models (SSMs). In particular, I will highlight a key challenge that prevents sub-quadratic models from being hardware-efficient during decoding. I will then propose an adjustment to the model that addresses this challenge and demonstrate its key advantages from an inference-first perspective. I will further motivate the change by connecting it to classic SSMs. Drawing on this connection, I will highlight two additional architectural adjustments to Mamba that are naturally motivated by classic SSM formulations and demonstrate their roles in improving model quality.

Reading List: https://docs.google.com/document/d/1T2z9TI_StxTt9tMNawjSo5842qr8JSBP2zgl7oS3...

Everyone is invited to attend the talk, and faculty wishing to remain for the oral exam that follows are welcome to do so.