CS Colloquium speakers: week of February 26

CS Colloquium speakers

Speaker: Jialin Ding, Amazon Web Services
Date: Monday, February 26
Time: 12:30pm EST
Location: CS 105
Host: Wyatt Lloyd
Event page: https://www.cs.princeton.edu/events/26578
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_10z_l8MYRr21ZOuPqA8Odw

Title: Instance-Optimization: Rethinking Database Design for the Next 1000X

Abstract: Modern database systems aim to support a large class of different use cases while simultaneously achieving high performance. However, as a result of their generality, databases often achieve adequate performance for the average use case but not the best performance for any individual use case. In this talk, I will describe my work on designing databases that use machine learning and optimization techniques to automatically achieve performance much closer to optimal for each individual use case. In particular, I will present my work on instance-optimized database storage layouts, in which the co-design of data structures and optimization policies improves query performance in analytic databases by orders of magnitude. I will highlight how these instance-optimized data layouts address various challenges posed by real-world database workloads and how I implemented and deployed them in production within Amazon Redshift, a widely used commercial database system.

Bio: Jialin Ding is an Applied Scientist at AWS. Before that, he received his PhD in computer science from MIT, advised by Tim Kraska. He works broadly on applying machine learning and optimization techniques to improve data management systems, with a focus on building databases that automatically self-optimize to achieve high performance for any specific application. His work has appeared in top conferences such as SIGMOD, VLDB, and CIDR, and has been recognized by a Meta Research PhD Fellowship. To learn more about Jialin’s work, please visit https://jialinding.github.io/.
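As a toy illustration of why storage layouts can matter so much for analytic queries (a generic sketch of block skipping with per-block min/max metadata, often called zone maps; this is not Redshift's actual layout algorithm), note how sorting data by the filtered column shrinks the number of blocks a range query must scan:

```python
import numpy as np

def blocks_scanned(column, block_size, queries):
    """Count blocks a set of range queries must read, given per-block
    min/max metadata ("zone maps"): a block is skipped whenever its
    [min, max] range cannot overlap the query range."""
    blocks = [column[i:i + block_size] for i in range(0, len(column), block_size)]
    meta = [(int(b.min()), int(b.max())) for b in blocks]
    return sum(1 for lo, hi in queries
                 for mn, mx in meta
                 if not (mx < lo or mn > hi))

rng = np.random.default_rng(0)
col = rng.integers(0, 10_000, size=10_000)
queries = [(q, q + 100) for q in range(0, 10_000, 1_000)]  # 10 narrow range queries

unsorted_cost = blocks_scanned(col, 100, queries)
# sorting by the queried column tightens every block's [min, max],
# so far fewer blocks overlap any given query range
sorted_cost = blocks_scanned(np.sort(col), 100, queries)
```

An instance-optimized layout generalizes this idea: instead of a fixed sort key, the system chooses the layout (and the policy for choosing it) from the observed workload.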
CSML/CS/PSY Colloquium

Speaker: Brenden Lake, New York University
Date: Tuesday, February 27
Time: 12:30pm EST
Location: CS 105
Host: Tom Griffiths
Event page: https://csml.princeton.edu/events/csmlpsycs-seminar
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_aUfLfVQFSIqV2RDSYu4big

Title: Towards more human-like learning in machines: Bridging the data and generalization gaps

Abstract: There is an enormous data gap between how AI systems and children learn: the best LLMs now learn language from text with a word count in the trillions, whereas it would take a child roughly 100K years to reach those numbers through speech (Frank, 2023, "Bridging the data gap"). There is also a clear generalization gap: whereas machines struggle with systematic generalization, children can excel. For instance, once a child learns how to "skip," they immediately know how to "skip twice" or "skip around the room with their hands up" due to their compositional skills. In this talk, I'll describe two case studies in addressing these gaps. 1) The data gap: we train deep neural networks from scratch (using DINO, CLIP, etc.), not on large-scale data from the web, but through the eyes and ears of a single child. Using head-mounted video recordings from a child as training data (<200 hours of video slices over 26 months), we show how deep neural networks can perform challenging visual tasks, acquire many word-referent mappings, generalize to novel visual referents, and achieve multi-modal alignment. Our results demonstrate how today's AI models are capable of learning key aspects of children's early knowledge from realistic input. 2) The generalization gap: can neural networks capture human-like systematic generalization? We address a 35-year-old debate catalyzed by Fodor and Pylyshyn's classic article, which argued that standard neural networks are not viable models of the mind because they lack systematic compositionality -- the algebraic ability to understand and produce novel combinations from known components. We'll show how neural networks can achieve human-like systematic generalization when trained through meta-learning for compositionality (MLC), a new method for optimizing the compositional skills of neural networks through practice. With MLC, neural networks can match human performance and solve several machine learning benchmarks. Given these findings, we'll discuss paths forward for building machines that learn, generalize, and interact in more human-like ways based on more natural input.

Related articles:
Vong, W. K., Wang, W., Orhan, A. E., and Lake, B. M. (2024). Grounded language acquisition through the eyes and ears of a single child. Science, 383, 504-511.
Orhan, A. E., and Lake, B. M. (in press). Learning high-level visual representations from a child’s perspective without strong inductive biases. Nature Machine Intelligence.
Lake, B. M., and Baroni, M. (2023). Human-like systematic generalization through a meta-learning neural network. Nature, 623, 115-121.

Bio: Brenden M. Lake is an Assistant Professor of Data Science and Psychology at New York University. He received his M.S. and B.S. in Symbolic Systems from Stanford University in 2009, and his Ph.D. in Cognitive Science from MIT in 2014. He was a postdoctoral Data Science Fellow at NYU from 2014 to 2017. Brenden is a recipient of the Robert J. Glushko Prize for Outstanding Doctoral Dissertation in Cognitive Science and the MIT Technology Review 35 Innovators Under 35. His research was also selected by Scientific American as one of the 10 most important advances of 2016. Brenden's research focuses on computational problems that are easier for people than they are for machines, such as learning new concepts from just a few examples, learning by asking questions, learning by generating new goals, and learning by producing novel combinations of known components.

Speaker: Eric Mitchell, Stanford University
Date: Thursday, February 29
Time: 12:30pm EST
Location: CS 105
Host: Karthik Narasimhan
Event page: https://www.cs.princeton.edu/events/26587
Register for live-stream online here: TBA
Talk info: TBA

CS Colloquium speaker

Speaker: Eric Mitchell, Stanford University
Date: Thursday, February 29
Time: 12:30pm EST
Location: CS 105
Host: Karthik Narasimhan
Event page: https://www.cs.princeton.edu/events/26587
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_d0Ueg7lYSzmzHvxwq-FgaA

Title: Making Language Models Useful

Abstract: Large pre-trained language models, most notably GPT-3, are the engines of knowledge and capability underpinning powerful systems such as ChatGPT, Gemini, and Claude. Yet much like building a safe, comfortable vehicle requires more than a powerful engine, building a useful, beneficial language system requires additional techniques to promote key attributes such as controllability, factuality, and updatability. This talk will share my work towards imbuing large language models with these traits. I will first share the direct preference optimization algorithm, a more scalable algorithm for training language models to follow instructions in accordance with human preferences. I will next discuss approaches for improving the factual reliability of language models, which is challenging even for models that generally follow user instructions well. Finally, I will share my work towards methods for updating individual model behaviors or beliefs that have fallen out of date or are otherwise problematic. I will conclude with several important topics for future work toward more useful, trustworthy AI systems, including unsupervised continual learning, scalable oversight, and robust reasoning.

Bio: Eric Mitchell is a final-year PhD student in Stanford’s Computer Science department, advised by Chelsea Finn and Christopher Manning. His research uses tools from machine learning to improve the usefulness and reliability of language models, in particular by developing techniques that enhance their controllability, factuality, and updatability. His work has appeared in ICML, NeurIPS, ICLR, and EMNLP, and was recognized with an Outstanding Paper Runner-Up Award at NeurIPS ‘23. His work, in particular the direct preference optimization algorithm, has been used widely in state-of-the-art open-source and proprietary language models. He is a former Knight-Hennessy Scholar and received his BS from Princeton University.
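For readers unfamiliar with direct preference optimization (DPO), its core is a simple logistic loss on the gap between the policy's and a frozen reference model's log-probabilities of a preferred versus a dispreferred response. A minimal numeric sketch (the log-probability values below are made up; a real implementation computes them with the actual models):

```python
import numpy as np

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair, given summed token log-probabilities
    under the trained policy and the frozen reference model."""
    # implicit reward margin: how much more the policy (relative to the
    # reference) favors the chosen response over the rejected one
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log sigmoid(margin): small when the policy prefers the chosen response
    return -np.log(1.0 / (1.0 + np.exp(-margin)))

# at initialization the policy equals the reference, so the margin is 0
# and the loss is exactly log(2)
init_loss = dpo_loss(-5.0, -7.0, -5.0, -7.0)
# once training shifts probability toward the chosen response, the loss drops
trained_loss = dpo_loss(-4.0, -9.0, -5.0, -7.0)
```

The appeal is that this is plain supervised-style optimization on preference pairs, with no separate reward model or RL loop.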

CS Colloquium speakers

Speaker: Xiang Lisa Li, Stanford University
Date: Monday, February 24
Time: 12:30pm EST
Location: CS 105
Host: Sanjeev Arora
Event page: https://www.cs.princeton.edu/events/controlling-language-models
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_HscNKtVWSfiVBw0zM6q_ag

Title: Controlling Language Models

Abstract: Controlling language models is key to unlocking their full potential and making them useful for downstream tasks. Successfully deploying these models often requires both task-specific customization and rigorous auditing of their behavior. In this talk, I will begin by introducing a customization method called Prefix-Tuning, which adapts language models by updating only 0.1% of their parameters. Next, I will address the need for robust auditing by presenting a Frank-Wolfe-inspired algorithm for red-teaming language models, which provides a principled framework for discovering diverse failure modes. Finally, I will rethink the root cause of these control challenges, and propose a new generative model for text, called Diffusion-LM, which is controllable by design.

Bio: Xiang Lisa Li is a PhD candidate at Stanford University, where she is advised by Percy Liang and Tatsunori Hashimoto. Her research focuses on developing methods to make language models more capable and controllable. Lisa is supported by the Two Sigma PhD Fellowship and the Stanford Graduate Fellowship and is the recipient of an EMNLP Best Paper Award.

Speaker: Zhen Dong, University of California, Berkeley
Date: Tuesday, February 25
Time: 12:30pm EST
Location: CS 105
Host: Kai Li
Event page: https://www.cs.princeton.edu/events/make-ai-more-accessible-and-run-faster
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_yrwYu8XfT2-UxV-2RVKGvg

Title: Make AI More Accessible and Run Faster

Abstract: LLMs and diffusion models have achieved great success in recent years. However, many AI models, particularly those with state-of-the-art performance, have a high computational cost and memory footprint. This impedes the development of pervasive AI in scenarios lacking sufficient computational resources (e.g., IoT devices, lunar rovers), requiring ultra-fast inference (e.g., AI4Science), or demanding real-time interaction under constrained computation (e.g., AR/VR, embodied AI). Model compression (quantization, pruning, distillation, etc.) and hardware-software co-design are promising approaches to achieving efficient AI, making AI more accessible and faster to run. In this talk, I will first introduce my work on 1) mixed-precision quantization based on Hessian analysis (HAWQv1/v2, ZeroQ, Q-BERT) and 2) hardware-software co-design (HAWQv3, CoDeNet, HAO). Then I will talk about my ongoing and future work in the era of LLMs and GenAI, including SqueezeLLM, Q-Diffusion, efficient AI agent systems, advanced CoT distillation, efficient deep thinking for OpenAI o1 and DeepSeek-R1, etc. My research vision is that efficient AI is becoming indispensable both at the edge, where increasingly powerful sensors with diverse modalities generate huge volumes of local data, and in the cloud, where reducing costs is essential to bridge the speed gap between inference scaling laws and Moore’s law for hardware.

Bio: Dr. Zhen Dong is currently a Postdoc at UC Berkeley. He obtained his Ph.D. from Berkeley AI Research, advised by Prof. Kurt Keutzer. Before Berkeley, Zhen received his B.E. from Peking University. Zhen's research focuses on efficient AI, model compression, hardware-software co-design, and AI systems. Zhen has received the Berkeley University Fellowship and the SenseTime Scholarship. He has published over 10 papers as the first or co-first author at top AI conferences. He won the Best Paper Award at the AAAI Practical-DL Workshop, and he is also a winner of the DAC 2024 PhD Forum and the CVPR 2024 Doctoral Consortium.
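To give a flavor of the quantization work mentioned in Dong's abstract, here is a generic uniform-quantization sketch (illustrative only; the HAWQ line of work additionally chooses per-layer bit widths via Hessian-based sensitivity analysis, which this sketch does not implement):

```python
import numpy as np

def quantize(w, bits):
    """Symmetric uniform quantization: map float weights onto a signed integer grid."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax            # one scale factor per tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q.astype(np.int32), scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)

# fewer bits -> coarser grid -> larger round-trip error, but smaller storage
err8 = np.abs(dequantize(*quantize(w, 8)) - w).max()
err4 = np.abs(dequantize(*quantize(w, 4)) - w).max()
```

Mixed precision then amounts to spending more bits on the layers whose loss is most sensitive to this rounding error, and fewer on the rest.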
Speaker: Yuke Zhu, University of Texas at Austin
Date: Thursday, February 27
Time: 12:30pm EST
Location: CS 105
Host: Jia Deng
Event page: https://www.cs.princeton.edu/events/pathway-generalist-robot-autonomy-data-c...
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_p4RA6oDFTiCVquo7_GGJdg

Title: Pathway to Generalist Robot Autonomy — A Data-Centric Approach

Abstract: In an era of rapid AI progress, leveraging accelerated computing and big data has unlocked new possibilities to develop general-purpose AI models. As AI systems like ChatGPT showcase remarkable performance in the digital realm, we are compelled to ask: Can we achieve similar breakthroughs in the physical world — to create generalist robots capable of performing everyday tasks? In this talk, I will present my data-centric research principles and methodologies for building general-purpose robot autonomy in open-world environments. I will discuss my recent work on building compositional robot autonomy stacks with diverse data sources. I will also present a human-in-the-loop framework for trustworthy robot deployment and continual learning. Combining these advances with cutting-edge developments in humanoid robotics, I will outline a roadmap toward the next generation of autonomous robots.

Bio: Yuke Zhu is an Assistant Professor in the Computer Science Department of UT-Austin, where he directs the Robot Perception and Learning (RPL) Lab. He also co-leads the Generalist Embodied Agent Research (GEAR) lab at NVIDIA Research, which builds foundation models for embodied agents in virtual and physical worlds, particularly for humanoid robots. He focuses on developing intelligent algorithms for generalist robots and embodied agents to reason about and interact with the real world. His research spans robotics, computer vision, and machine learning. He received his Master's and Ph.D. degrees from Stanford University. His work has won various awards and nominations, including the Best Conference Paper Award at ICRA 2019 and 2024, the Outstanding Learning Paper Award at ICRA 2022, and the Outstanding Paper Award at NeurIPS 2022. He received the NSF CAREER Award and faculty awards from Amazon, JP Morgan, and Sony Research.
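Returning to Prefix-Tuning from Lisa Li's talk above: the headline figure of updating only ~0.1% of parameters comes from training a short prefix of virtual key/value vectors per layer while freezing the base model. A back-of-the-envelope sketch, with illustrative (not exact) transformer sizes:

```python
# rough GPT-2-medium-like sizes (illustrative assumptions, not exact counts)
d_model, n_layers, prefix_len = 1024, 24, 10

# frozen base model: roughly 12 * d_model^2 parameters per transformer layer
base_params = n_layers * 12 * d_model ** 2

# trainable prefix: prefix_len virtual key and value vectors in every layer
prefix_params = n_layers * prefix_len * 2 * d_model

trainable_fraction = prefix_params / (base_params + prefix_params)
# on the order of 0.1-0.2% of all parameters, matching the abstract's claim
```

The practical payoff is that one frozen copy of the model can serve many tasks, each distinguished only by its small trained prefix.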

CS Colloquium speakers

Speaker: Noah Golowich, Massachusetts Institute of Technology
Date: Monday, March 3
Time: 12:30pm EST
Location: CS 105
Host: Elad Hazan
Event page: https://www.cs.princeton.edu/events/theoretical-foundations-multi-agent-lear...
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_E0TDd9QdQfi2XAdSueJ_iQ

Title: Theoretical Foundations for Multi-Agent Learning

Abstract: As learning algorithms become increasingly capable of acting autonomously, it is important to better understand the behavior that results from their interactions. For example, a pervasive challenge in multi-agent learning settings, which spans both theory and practice and dates back decades, has been the failure of convergence for iterative algorithms such as gradient descent. Accordingly, a longstanding central question with broad relevance is: how quickly can we compute solution concepts, i.e., equilibria, in multi-agent settings? I will discuss results which address this question at a variety of levels, starting from foundational settings involving normal-form games and building up to complex problems such as multi-agent reinforcement learning which more aptly model realistic situations. First, I will present a result establishing a near-optimal convergence rate for a simple online learning algorithm in normal-form games, resolving a decade-long line of work which gave suboptimal bounds. I will then discuss a new algorithm for minimizing swap regret exponentially faster than previous approaches. Our algorithm allows us to answer several open questions, including establishing the first PTAS for correlated equilibria in extensive-form games. Beyond contending with agents' differing incentives, the increasing use of machine learning algorithms presents other challenges, such as the proliferation of AI-generated content. In the latter part of the talk, I will discuss an approach to detect such content via watermarking. Our watermarking scheme is the first to embed a watermark in a language model's output in a way which leads only to negligible changes in the distribution of the output but which is robust to adversarial edits.

Bio: Noah Golowich is a PhD student at the Massachusetts Institute of Technology, advised by Constantinos Daskalakis and Ankur Moitra. His research interests lie in theoretical machine learning, with a particular focus on the connections between multi-agent learning, game theory, online learning, and theoretical reinforcement learning. He has also worked on CS theory more broadly, including on problems in differential privacy, learning theory, and watermarking. He was supported by a Fannie & John Hertz Foundation Fellowship and an NSF Graduate Fellowship.

Speaker: Rose Wang, Stanford University
Date: Tuesday, March 4
Time: 12:30pm EST
Location: CS 105
Host: Tom Griffiths
Event page: https://www.cs.princeton.edu/events/scaling-expertise-language-models-applic...
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_8AaPm0K_RCG9PBMzxprLpg

Title: Scaling Expertise via Language Models: With Applications to Education

Abstract: Access to expertise shapes how individuals learn, develop, and succeed across society. For example, in education, experienced teachers teach students and train novice educators through effective interactions. However, access to expertise is limited, undermining learning at scale. While language models promise to democratize access, they often mimic surface-level patterns and lack the human touch needed to support learners through challenges. In this talk, I will present novel computational methods and interventions that embed expert-like thinking into language models and empower human novices in real-time interactions. First, I will present Bridge, an adaptation method that extracts latent expert reasoning to adapt language models for complex interactions. Then, I will introduce Tutor CoPilot, a novel human-AI approach that provides expert-like guidance to tutors in real time. In the first randomized controlled trial of a human-AI system for live tutoring, Tutor CoPilot significantly improves the quality of learning interactions for 900 tutors and 1,800 K-12 students from underserved communities.

Bio: Rose E. Wang is a Computer Science PhD candidate at Stanford University. She develops algorithms, benchmarks, and large-scale interventions to tackle challenges in real-world interactions, with a focus on education. Her work is deployed in industry and directly improves the education of underserved students through partnerships she has cultivated during her Ph.D., including Title I school districts and several education companies, impacting 200,000+ students, 1,700+ teachers, and 16,100+ tutors in millions of tutoring sessions across the U.S., UK, and India. Her work is recognized by the 2025 Economic Report of the President, an NSF Graduate Research Fellowship, a CogSci Best Paper Award, a NeurIPS Cooperative AI Best Paper Award, an ICLR Oral, Rising Star in Data Science, a Building Educational Applications Ambassador Paper Award, and the Learning Engineering Tools Competition Award.

Speaker: Ruiqi Zhong, University of California, Berkeley
Date: Thursday, March 6
Time: 12:30pm EST
Location: CS 105
Host: Danqi Chen
Event page: https://www.cs.princeton.edu/events/building-expert-level-language-models-de...
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_05z1hr3MQKalcALbEK_clg

Title: Building Expert-Level Language Models from Decomposed Weak Validations

Abstract: Language models (LMs) can process large volumes of information and perform complex reasoning. They hold the promise of executing expert-level tasks, such as brainstorming scientific hypotheses or developing complex software. However, building these LMs requires humans to validate their outputs, which is challenging; e.g., developers cannot easily validate whether complex software is bug-free. If our validation is fallible, LMs may learn to "hack" the validators, convincing us that they are right even when they are wrong. To address this, I show how to decompose complex validation tasks into "weaker" ones that are easier for humans or LMs: e.g., validating return values rather than entire programs, or validating discoveries on individual samples rather than on entire datasets. Through several examples, I show how these techniques allow us to use LMs for expert-level tasks more reliably. Looking forward, I discuss how to use LMs to automate these task decompositions, and how we can use these frameworks to monitor both individual AI systems and their broader impact within society.

Bio: Ruiqi Zhong is a final-year Ph.D. student at UC Berkeley, co-advised by Jacob Steinhardt and Dan Klein. He was previously a part-time member of technical staff at Anthropic, where he worked on the automated red-teaming team. His research is at the intersection of machine learning and NLP, and he develops language model systems to advance the frontier of human capabilities. He developed the earliest prototype of instruction-tuning, and his research contributions have been scaled up by leading language model companies, such as Google, OpenAI, and Anthropic.
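The "weaker validation" idea in Zhong's abstract (checking return values on known cases rather than verifying a whole program) can be sketched in a few lines; the function and case names here are hypothetical illustrations, not from the talk:

```python
def validate_by_samples(candidate, reference_cases):
    """Weak validation: rather than proving a whole program correct,
    check its return values against known input/output pairs."""
    failures = [(x, candidate(x), expected)
                for x, expected in reference_cases
                if candidate(x) != expected]
    return len(failures) == 0, failures

cases = [(0, 0), (1, 2), (5, 10)]                 # spec: double the input
ok, _ = validate_by_samples(lambda x: 2 * x, cases)           # correct program
bad, fails = validate_by_samples(lambda x: x + x % 3, cases)  # subtly wrong
```

A fallible validator like this can still be gamed on the sampled cases, which is precisely the "validator hacking" failure mode the talk aims to address.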

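No-regret online learning is the building block behind the normal-form-game convergence results in Golowich's abstract. As a generic illustration (a minimal multiplicative-weights/Hedge sketch, not the specific algorithm or rates from the talk):

```python
import numpy as np

def hedge(payoff_seq, eta=0.1):
    """Multiplicative weights: each round, play the normalized weight vector
    over actions, then exponentially upweight actions that paid well."""
    T, n = payoff_seq.shape
    w = np.ones(n)
    total = 0.0
    for t in range(T):
        p = w / w.sum()
        total += p @ payoff_seq[t]           # expected payoff this round
        w *= np.exp(eta * payoff_seq[t])     # reward actions that did well
    return total

rng = np.random.default_rng(0)
payoffs = rng.uniform(0.0, 1.0, size=(1000, 3))   # payoffs in [0, 1]
algo_payoff = hedge(payoffs)
best_fixed = payoffs.sum(axis=0).max()
# regret vs. the best fixed action grows only sublinearly in T
regret = best_fixed - algo_payoff
```

When every player in a game runs such a no-regret algorithm, the time-averaged play approaches equilibrium; the talk's contribution concerns how fast that happens.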

Speaker: Rose Wang, Stanford University Date: Tuesday, March 4 Time: 12:30pm EST Location: CS 105 Host: Tom Griffiths Event page: https://www.cs.princeton.edu/events/scaling-expertise-language-models-applic... Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_8AaPm0K_RCG9PBMzxprLpg Title: Scaling Expertise via Language Models: With Applications to Education Abstract: Access to expertise shapes how individuals learn, develop, and succeed across society. For example, in education, experienced teachers teach students and train novice educators through effective interactions. However, access to expertise is limited, undermining learning at scale. While language models promise to democratize access, they often mimic surface-level patterns and lack the human touch needed to support learners through challenges. In this talk, I will present novel computational methods and interventions that embed expert-like thinking into language models and empower human novices in real-time interactions. First, I will present Bridge, an adaptation method that extracts latent expert reasoning to adapt language models for complex interactions. Then, I will introduce Tutor CoPilot, a novel Human-AI approach that provides expert-like guidance to tutors in real time. In the first randomized controlled trial of a Human-AI system for live tutoring, Tutor CoPilot significantly improves the quality of learning interactions for 900 tutors and 1,800 K-12 students from underserved communities. Bio: Rose E. Wang is a Computer Science PhD candidate at Stanford University. She develops algorithms, benchmarks and large-scale interventions to tackle challenges in real-world interactions, with a focus on Education. 
Her work is deployed in industry and directly improves the education of under-served students through partnerships she has cultivated during her Ph.D., including Title I school districts and several education companies, impacting 200,000+ students, 1,700+ teachers, 16,100+ tutors, in millions of tutoring sessions across the U.S., UK and India. Her work is recognized by the 2025 Economic Report of the President, NSF Graduate Research Fellowship, CogSci Best Paper Award, NeurIPS Cooperative AI Best Paper Award, ICLR Oral, Rising Star in Data Science, Building Educational Applications Ambassador Paper Award, and the Learning Engineering Tools Competition Award.

Speaker: Ruiqi Zhong, University of California, Berkeley Date: Thursday, March 6 Time: 12:30pm EST Location: CS 105 Host: Danqi Chen Event page: https://www.cs.princeton.edu/events/building-expert-level-language-models-de... Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_05z1hr3MQKalcALbEK_clg Title: Building Expert-Level Language Models from Decomposed Weak Validations Abstract: Language models (LMs) can process large volumes of information and perform complex reasoning. They hold the promise of executing expert-level tasks, such as brainstorming scientific hypotheses or developing complex software. However, building these LMs requires humans to validate their outputs, which is challenging; e.g., developers cannot easily validate whether complex software is bug-free. If our validation is fallible, LMs may learn to "hack" the validators, convincing us that they are right even when they are wrong. To address this, I show how to decompose complex validation tasks into "weaker" ones that are easier for humans or LMs: e.g., validating return values rather than entire programs, or validating discoveries on individual samples rather than on entire datasets. Through several examples, I show how these techniques allow us to use LMs for expert-level tasks more reliably. Looking forward, I discuss how to use LMs to automate these task decompositions, and how we can use these frameworks to monitor both individual AI systems and their broader impact within society. Bio: Ruiqi Zhong is a final-year Ph.D. student at UC Berkeley, co-advised by Jacob Steinhardt and Dan Klein. He was previously a part-time member of technical staff at Anthropic, where he worked on the automated red teaming team. His research is at the intersection of machine learning and NLP, and he develops language model systems to advance the frontier of human capabilities. 
He developed the earliest prototype of instruction-tuning, and his research contributions have been scaled up by leading language model companies such as Google, OpenAI, and Anthropic.

Speaker: Seah Kim, University of California, Berkeley Date: Monday, March 24 Time: 12:30pm EST Location: CS 105 Host: Margaret Martonosi Event page: https://www.cs.princeton.edu/events/scalable-soc-architectures-domain-specif... Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_RfmWTnaZRx63qJXOBiKJHQ Title: Scalable SoC Architectures for Domain-Specific Computing: From Algorithms to Silicon Abstract: Modern software stacks contain concurrent and heterogeneous workloads with bespoke constraints. This is especially true for emerging edge applications, such as AR/VR, robotics, and autonomous vehicles. In response to these demands, hardware has shifted towards pervasive specialization, making development and cross-stack integration increasingly challenging. This shift raises the pressing question at the core of modern computing: How do we enable scalable specialization in modern SoCs? How do we design and integrate heterogeneous accelerators while ensuring performance scalability through efficient resource management and adaptability across system layers? In this talk, I will present my research addressing these interconnected challenges of scalable specialization through full-stack, system approaches. (1) First, I will introduce Gemmini, an award-winning, widely used DNN accelerator generator that enables agile, full-stack accelerator evaluation. Gemmini allows researchers to explore the specialized accelerator design space within a full SoC. (2) Next, I will present AuRORA, an award-winning virtualized accelerator integration approach with dynamic resource allocation, paving the foundation for accelerator-rich SoCs. AuRORA introduces a novel CPU-accelerator interface that enables fast and flexible resource repartitioning, along with a runtime system that abstracts physical accelerators into a unified virtualized resource pool.
(3) Then, I will introduce SuperNoVA, an algorithm-hardware co-design for real-time, dynamic workloads on resource-constrained platforms, using SLAM as a target workload. SuperNoVA tackles the challenge of balancing accuracy and real-time execution with an adaptive algorithm for large-scale SLAM. (4) Finally, I will showcase a silicon test chip I taped out that embodies my research by integrating these innovations. Silicon validation with real workloads demonstrates the feasibility of scalable specialization. By bridging hardware design, system software, application algorithms, and silicon validation, my research enables adaptive, accelerator-rich computing platforms for modern edge applications. Looking ahead, I aim to revolutionize edge SoC design by combining design-time hardware-software co-optimization with runtime adaptive resource management, achieving the best of both static specialization and dynamic flexibility to address the evolving demands of future edge platforms. Bio: Seah Kim is a Ph.D. Candidate at UC Berkeley, specializing in Computer Architecture and VLSI. Her research spans the full computing stack, from chip design and hardware development to system software and application algorithms, with a focus on scalable domain-specific SoC design. She has been awarded the IEEE Micro Top Pick in Computer Architecture (MICRO 2023), the Best Paper Award (DAC 2021), and the Distinguished Artifact Award (ISCA 2023). Prior to UC Berkeley, she earned a B.S. in Electrical and Computer Engineering from Seoul National University. Seah was selected as a 2024 Rising Star in EECS and a 2023 ML and Systems Rising Star.
Speaker: Haozhi Qi, University of California, Berkeley Date: Tuesday, March 25 Time: 12:30pm EST Location: CS 105 Host: Felix Heide Event page: https://www.cs.princeton.edu/events/multisensory-dexterity-robotics Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_q6wkrHb1TA2K3JuXbXL__g Title: Multisensory Dexterity for Robotics Abstract: Human hands are essential for sensing and interacting with the physical world, allowing us to grasp and manipulate objects with ease. Replicating this dexterity in robots is the key to unlocking general-purpose robotics in unstructured environments. While modern AI has achieved breakthroughs in many domains, robot dexterity remains an unsolved challenge due to the complexity of high-dimensional control, limited real-world data, and the need for rich multisensory feedback. In this talk, I will present my work on multisensory dexterity for robotics and demonstrate how robots can achieve a broad range of dexterous manipulation capabilities. First, I will introduce how robots develop dexterous manipulation using simple sensory inputs and identify the key ingredients that enable generalizable manipulation across diverse objects, with examples in in-hand and bimanual manipulation. Building on these ingredients, I will then show how integrating rich multisensory feedback—including proprioception, vision, and tactile sensing—improves both perception and control, allowing robots to perform tasks that would be impossible with simple sensors. Finally, I will conclude with future opportunities and open challenges in scaling robotic dexterity and developing robots capable of general-purpose physical interaction. Bio: Haozhi Qi is a final-year Ph.D. candidate in the EECS Department at UC Berkeley, advised by Prof. Yi Ma and Prof. Jitendra Malik. 
His research lies at the intersection of robot learning, computer vision, and tactile sensing, with the goal of developing physically intelligent, particularly dexterous, robots for unstructured environments. He received his B.S. in Mathematics and Computer Science from the Hong Kong University of Science and Technology. His work on in-hand perception was featured as the cover article in Science Robotics. He has been recognized with the Outstanding Demo Award at the NeurIPS Robot Learning Workshop and the EECS Evergreen Award for Undergraduate Researcher Mentoring. Speaker: Prof. Moshe Y. Vardi, Rice University Date: Tuesday, March 25 Time: 2:00pm EST Location: CS 105 Host: Aarti Gupta Event page: https://www.cs.princeton.edu/events/what-theoretical-computer-science Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_Cmq0997eQn6Cv0fCXpJl2Q Title: What Is Theoretical Computer Science? Abstract: Wikipedia defines theoretical computer science (TCS) as “a subfield of computer science and mathematics that focuses on the abstract mathematical foundations of computation.” I will take issue with this definition. I believe that thinking of TCS as a branch of mathematics is harmful to the discipline. The centrality of computing stems from the fact that it is a technology that has been changing the world for the past 80 years. As computer scientists, we should look for inspiration from physics rather than from mathematics. Theoretical physics is highly mathematical, but it aims to explain and predict the real world. Theories that fail at this “explain/predict” task would ultimately be discarded. Analogously, I will argue that the role of TCS is to explain/predict real-life computing.
We should remember the warning of John von Neumann, one of the greatest mathematicians and computer scientists of the 20th century, regarding the danger of mathematics driven solely by internal esthetics: “There is a grave danger that the subject will develop along the line of least resistance.” I will use Boolean reasoning as the running example to illustrate this thesis. Bio: Moshe Y. Vardi is University Professor and the George Distinguished Service Professor in Computational Engineering at Rice University. His research focuses on the interface of mathematical logic and computation -- including database theory, hardware/software design and verification, multi-agent systems, and constraint satisfaction. He is the recipient of numerous awards, including the ACM SIGACT Gödel Prize, the ACM Kanellakis Award, the ACM SIGMOD Codd Award, the Knuth Prize, the IEEE Computer Society Goode Award, and the EATCS Distinguished Achievements Award. He is the author or co-author of over 800 papers, as well as two books. He is a Guggenheim Fellow as well as a fellow of several societies, and a member of several academies, including the US National Academy of Engineering, the National Academy of Sciences, the American Academy of Arts and Sciences, and the Royal Society of London. He holds ten honorary titles. He is a Senior Editor of the Communications of the ACM, the premier publication in computing.

Speaker: Haoshu Fang, Massachusetts Institute of Technology Date: Tuesday, April 1 Time: 12:30pm EST Location: CS 105 Host: Ben Eysenbach Event page: https://www.cs.princeton.edu/events/science-data-human-level-robotic-manipul... Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_i8siw7H2ROmf0g3f9JlGTg Title: Science of Data for Human-Level Robotic Manipulation Abstract: Machine learning has revolutionized many subfields of robotics, from visual perception to task planning. However, the fundamental challenge of low-level motor control for object manipulation with raw sensory observations remains unresolved, primarily due to the lack of robot state and action data during manipulation. This issue is particularly pronounced in tasks requiring multi-finger coordination and fine tactile sensing. Addressing the data problem is essential, as many modalities of robotic data, such as tactile and proprioceptive information, are not readily available online. The key scientific questions in this domain are: (i) how to collect data, (ii) what data to collect, and (iii) how to learn effectively from such data. In this talk, I will (i) introduce a novel paradigm for data collection through the design of innovative interaction interfaces, (ii) demonstrate how identifying key dimensions for scaling data can enable human-level robotic grasping, and (iii) present methods and insights on efficiently learning from heterogeneous robotic data. Bio: Haoshu Fang is a Postdoctoral Researcher at MIT CSAIL, working with Pulkit Agrawal and Edward Adelson. He earned his PhD from Shanghai Jiao Tong University. Haoshu's research focuses on general robotic manipulation, addressing the data challenge by designing data-centric hardware, leveraging data scaling laws, and developing data-efficient learning methods.
His work has been recognized with three best paper awards or nominations at top robotics conferences and prestigious fellowships from Microsoft, Baidu, and ByteDance. Speaker: Shishir Patil, University of California, Berkeley Date: Thursday, April 3 Time: 12:30pm EST Location: CS 105 Host: Ravi Netravali Event page: https://www.cs.princeton.edu/events/agenticsystems-teaching-llms-use-tools-s... Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_RXiSP_bZThmG6agRH4trNA Title: AgenticSystems: Teaching LLMs to Use Tools at Scale Abstract: In this talk, I will present our vision of AgenticSystems and introduce our approach to integrating Large Language Models (LLMs) with various tools via APIs. Connecting LLMs with APIs presents a significant challenge, as these models often struggle to generate precise input arguments and are prone to hallucinating API calls. To address this, we developed Gorilla LLM, trained with our novel Retriever-Aware Training (RAT), setting a new benchmark for tool use in LLMs. Gorilla also introduces a programming language-inspired metric to quantify hallucinations, a common issue in LLMs. I will conclude by presenting GoEx, a runtime for executing actions generated by LLMs—such as code and API calls—across agents, workflows, and LLM-powered microservices. A key innovation in GoEx is the incorporation of "undo" and "damage confinement" abstractions to mitigate unintended actions and risks. The Gorilla project kick-started tool-calling in LLMs, and with millions of user requests, widespread enterprise adoption—including all leading LLM labs—and a thriving open-source community, the Gorilla project continues to shape the evolving field of tool-calling for agentic LLMs. Bio: Shishir G. Patil received his PhD from UC Berkeley, where he was advised by Joseph Gonzalez, Prabal Dutta, and Ion Stoica. He is interested in designing and building efficient machine-learning systems.
Recently, his focus has been on teaching LLMs to use tools through API calls. His works include Gorilla LLM, RAFT, OpenFunctions, Berkeley Function Calling Leaderboard, Skyplane, and POET. He was a Research Fellow at Microsoft Research before starting his PhD.

Speaker: Simran Arora, Stanford University Date: Monday, April 7 Time: 12:30pm EST Location: CS 105 Host: Mae Milano Event page: https://www.cs.princeton.edu/events/pareto-efficient-ai-systems-expanding-qu... Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_3VSa9cXIQE24GdQVOrvwCQ Title: Pareto-efficient AI systems: Expanding the quality and efficiency frontier of AI Abstract: We have made exciting progress in AI by scaling massive models on massive amounts of data center compute. However, this represents a small fraction of AI’s potential. My work expands the Pareto frontier between the AI capabilities we can achieve and the long tail of compute constraints. In this talk, we piece-by-piece build up to a language model architecture that expands the Pareto frontier between quality and throughput efficiency. The Transformer, AI’s current workhorse architecture, is memory hungry, limiting its throughput, or the amount of data it can process per second. This has led to a Cambrian explosion of alternate architecture candidates proposed across prior work. Prior work paints an exciting picture: there are architectures that are asymptotically faster than the Transformer, while also matching its quality. However, I ask: if we’re using asymptotically faster building blocks, what, if anything, are we giving up in quality? 1. In part one, we understand the tradeoffs and show that indeed, there’s no free lunch. I present my work to identify and explain the fundamental quality and efficiency tradeoffs between different classes of architectures. Methods I developed for this analysis are now ubiquitous in the development of efficient language models. 2. In part two, we measure how existing architecture candidates fare on the tradeoff space. While many proposed architectures are asymptotically fast, they are not wall-clock fast compared to the Transformer.
I present ThunderKittens, a programming library that I built to help AI researchers develop hardware-efficient AI algorithms. 3. In part three, we expand the Pareto frontier of the tradeoff space. I present the BASED architecture, which is built from simple, hardware-efficient components. As a culmination, I released a suite of state-of-the-art 8B-405B parameter Transformer-free language models, per standard evaluations, all on an academic budget. Given the massive investment into AI models, this work blending AI and systems has had significant impact and adoption in research, open-source, and industry. Bio: Simran Arora is a PhD student at Stanford University advised by Chris Ré. Her research blends AI and systems towards expanding the Pareto frontier between AI capabilities and efficiency. Her machine learning research has appeared as Oral and Spotlight presentations at NeurIPS, ICML, and ICLR, including an Outstanding Paper award at NeurIPS and a Best Paper award at ICML ES-FoMo. Her systems work has appeared at VLDB, SIGMOD, CIDR, and CHI, and her systems artifacts are widely used in research, open-source, and industry. In 2023, Simran created and taught the CS229s Systems for Machine Learning course at Stanford. She has also been supported by an SGF Sequoia Fellowship and the Stanford Computer Science Graduate Fellowship. Speaker: Ofir Press, Princeton University Date: Thursday, April 10 Time: 12:30pm EST Location: CS 105 Host: Peter Henderson Event page: https://www.cs.princeton.edu/events/towards-autonomous-language-model-system... Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_fVUS0ZFATVesABqXSe-JUw Title: Towards Autonomous Language Model Systems Abstract: Language models (LMs) are increasingly used to assist users in day-to-day tasks such as programming (Github Copilot) or search (Google's AI Overviews). But can we build language model systems that are able to autonomously complete entire tasks end-to-end?
In this talk I'll discuss our efforts to build autonomous LM systems, focusing on the software engineering domain. I'll present SWE-bench, our novel method for measuring AI systems on their ability to fix real issues in popular software libraries. I'll then discuss SWE-agent, our system for solving SWE-bench tasks. SWE-bench and SWE-agent are used by many leading AI orgs in academia and industry including OpenAI, Anthropic, Meta, and Google, and SWE-bench has been downloaded over 2 million times. These projects show that academics on tight budgets are able to have substantial impact in steering the research community towards building autonomous systems that can complete challenging tasks. Bio: Ofir Press is a postdoc at Princeton University where he mainly works with Karthik Narasimhan's lab. He previously completed his PhD at the University of Washington in Seattle, where he was advised by Noah Smith. During his PhD he spent two years at Facebook AI Research Labs on Luke Zettlemoyer's team.

Speaker: Qianqian Wang, University of California, Berkeley Date: Monday, March 17 Time: 12:30pm EST Location: CS 105 Host: Olga Russakovsky Event page: https://www.cs.princeton.edu/events/learning-perceive-4d-world Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_h_qEBInjR5q-bD5R2unZqw Title: Learning to Perceive the 4D World Abstract: Perceiving the 4D world (i.e., 3D space over time) from visual input is essential for human interaction with the physical environment. While computer vision has made remarkable progress in 3D scene understanding, much of it remains piecemeal—for example, focusing solely on static scenes or specific categories of dynamic objects. How can we model diverse dynamic scenes in the wild? How can we achieve online perception with human-like capabilities? In this talk, I will first discuss holistic scene representations that enable long-range motion estimation and 4D reconstruction. I will then introduce a unified learning-based framework for online dense 3D perception, which continuously refines scene understanding with new observations. I will conclude by discussing future directions and challenges in advancing spatial intelligence. Bio: Qianqian Wang is a postdoctoral researcher at UC Berkeley, working with Prof. Angjoo Kanazawa and Prof. Alexei A. Efros. She received her Ph.D. in Computer Science from Cornell University in 2023, advised by Prof. Noah Snavely and Prof. Bharath Hariharan. Her research lies at the intersection of computer vision, computer graphics, and machine learning. She is a recipient of the ICCV Best Student Paper Award, CVPR Best Paper Honorable Mention, Cornell CS Dissertation Award, Google PhD Fellowship, and EECS Rising Stars. Speaker: Simran Arora, Stanford University Date: Tuesday, March 18 Time: 12:30pm EST Location: CS 105 Host: Mae Milano Event page: https://www.cs.princeton.edu/events/pareto-efficient-ai-systems-expanding-qu...
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_3VSa9cXIQE24GdQVOrvwCQ Title: Pareto-efficient AI systems: Expanding the quality and efficiency frontier of AI Abstract: We have made exciting progress in AI by scaling massive models on massive amounts of data center compute. However, this represents a small fraction of AI’s potential. My work expands the Pareto frontier between the AI capabilities we can achieve and the long tail of compute constraints. In this talk, we piece-by-piece build up to a language model architecture that expands the Pareto frontier between quality and throughput efficiency. The Transformer, AI’s current workhorse architecture, is memory hungry, limiting its throughput, or the amount of data it can process per second. This has led to a Cambrian explosion of alternate architecture candidates proposed across prior work. Prior work paints an exciting picture: there are architectures that are asymptotically faster than the Transformer, while also matching its quality. However, I ask: if we’re using asymptotically faster building blocks, what, if anything, are we giving up in quality? 1. In part one, we understand the tradeoffs and show that indeed, there’s no free lunch. I present my work to identify and explain the fundamental quality and efficiency tradeoffs between different classes of architectures. Methods I developed for this analysis are now ubiquitous in the development of efficient language models. 2. In part two, we measure how existing architecture candidates fare on the tradeoff space. While many proposed architectures are asymptotically fast, they are not wall-clock fast compared to the Transformer. I present ThunderKittens, a programming library that I built to help AI researchers develop hardware-efficient AI algorithms. 3. In part three, we expand the Pareto frontier of the tradeoff space. I present the BASED architecture, which is built from simple, hardware-efficient components.
As a culmination, I released a suite of state-of-the-art 8B-405B parameter Transformer-free language models, per standard evaluations, all on an academic budget. Given the massive investment into AI models, this work blending AI and systems has had significant impact and adoption in research, open-source, and industry. Bio: Simran Arora is a PhD student at Stanford University advised by Chris Ré. Her research blends AI and systems towards expanding the Pareto frontier between AI capabilities and efficiency. Her machine learning research has appeared as Oral and Spotlight presentations at NeurIPS, ICML, and ICLR, including an Outstanding Paper award at NeurIPS and a Best Paper award at ICML ES-FoMo. Her systems work has appeared at VLDB, SIGMOD, CIDR, and CHI, and her systems artifacts are widely used in research, open-source, and industry. In 2023, Simran created and taught the CS229s Systems for Machine Learning course at Stanford. She has also been supported by an SGF Sequoia Fellowship and the Stanford Computer Science Graduate Fellowship.

Speaker: Olivia Hsu, Stanford University Date: Thursday, March 20 Time: 12:30pm EST Location: CS 105 Host: Brian Kernighan Event page: https://www.cs.princeton.edu/events/language-silicon-programming-systems-spa... Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_j-QIWzFvR1mwBO3qOP9atg Title: From Language to Silicon: Programming Systems for Sparse Accelerators Abstract: In this era of specialization, modern hardware development focuses on domain-specific accelerator design due to the plateau in technology scaling combined with a continual need for performance. However, domain-specific programming systems for these accelerators require extreme engineering effort, and their complexity has largely caused them to lag behind.
Fundamentally, the widespread usability, proliferation, and democratization of domain-specific accelerators hinge on their programming systems, especially when targeting new domains. This talk presents research on accelerator programming systems for the emerging domain of sparse computation. The first system, the Sparse Abstract Machine (SAM), introduces a unified abstract machine model and compiler abstraction for sparse dataflow accelerators. SAM defines a novel streaming representation and abstract dataflow interfaces that serve as an abstraction to decouple sparse accelerator implementations from their programs, similar to a stable ISA but for dataflow. The second system, Mosaic, introduces modular and portable compilation solutions that can leverage heterogeneous sparse accelerators and high-performance systems within the same system. These systems are a first step towards usable and programmable heterogeneous hardware acceleration for all. I will conclude by discussing the next steps to reach this goal, which include programming systems for accelerators in other domains and interoperation between accelerators across domains. Bio: Olivia Hsu is a final-year Ph.D. candidate at Stanford University in the Department of Computer Science, advised by Professors Kunle Olukotun and Fredrik Kjolstad. She received her B.S. in Electrical Engineering and Computer Science (EECS) at UC Berkeley. Her broad research interests include computer architecture, computer and programming systems, compilers, programming languages, and digital circuits/VLSI. Olivia is a 2024 Rising Star in EECS and an NSF Graduate Research Fellow, and her research won a distinguished paper award at PLDI 2023. To learn more about her work, please visit her website at https://cs.stanford.edu/~owhsu.

This is a notice that tomorrow's talk has been postponed. We'll follow up with the new date once it's scheduled. Speaker: Simran Arora, Stanford University Date: Tuesday, March 18 Time: 12:30pm EST Location: CS 105 Host: Mae Milano Event page: https://www.cs.princeton.edu/events/pareto-efficient-ai-systems-expanding-qu...
