CS Colloquium Speakers

Speaker: Sewon Min, University of Washington
Date: Monday, March 4
Time: 12:30pm EST
Location: CS 105
Host: Danqi Chen
Event page: https://www.cs.princeton.edu/events/26579
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_5uVGEpXNTVe8eBNSLpxKtg

Title: Rethinking Data Use in Large Language Models

Abstract: Large language models (LMs) such as ChatGPT have revolutionized natural language processing and artificial intelligence more broadly. In this talk, I will discuss my research on understanding and advancing these models, centered around how they use the very large text corpora they are trained on. First, I will describe our efforts to understand how these models learn to perform new tasks after training, demonstrating that their so-called in-context learning capabilities are almost entirely determined by what they learn from the training data. Next, I will introduce a new class of LMs—nonparametric LMs—that repurpose this training data as a data store from which they retrieve information for improved accuracy and updatability. I will describe my work on establishing the foundations of such models, including one of the first broadly used neural retrieval models and an approach that simplifies a traditional, two-stage pipeline into one. I will also discuss how nonparametric models open up new avenues for responsible data use, e.g., by segregating permissive and copyrighted text and using them differently. Finally, I will envision the next generation of LMs we should build, focusing on efficient scaling, improved factuality, and decentralization.
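The nonparametric-LM idea in the abstract — repurposing the training corpus as a data store and retrieving from it at inference time — can be sketched in miniature. This is a toy illustration only, not Min's actual models: `embed` here is a bag-of-words stand-in for a learned dense encoder, and the retrieved passages would in practice be fed to the LM as extra context.

```python
# Toy sketch of a nonparametric LM's retrieval step (illustrative only).
# The "data store" is the training corpus itself; retrieve() returns the
# passages most similar to a query.
from collections import Counter
import math

def embed(text):
    """Bag-of-words vector as a stand-in for a learned dense encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, datastore, k=1):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(datastore, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

datastore = [
    "The Eiffel Tower is in Paris.",
    "Mount Fuji is the tallest mountain in Japan.",
    "The Colosseum is in Rome.",
]
print(retrieve("Where is the Eiffel Tower?", datastore))  # ['The Eiffel Tower is in Paris.']
```

Because the data store is external to the model's parameters, entries can be added, removed, or segregated (e.g., permissive vs. copyrighted text) without retraining — the updatability and responsible-data-use properties the abstract highlights.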

Bio: Sewon Min is a Ph.D. candidate in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Her research focuses on language models (LMs): studying the science of LMs, and designing new model classes and learning methods that make LMs more performant and flexible. She also studies LMs in information-seeking, legal, and privacy contexts. She is a co-organizer of multiple tutorials and workshops, including most recently at ACL 2023 on Retrieval-based Language Models and Applications and upcoming at ICLR 2024 on Mathematical and Empirical Understanding of Foundation Models. She won a paper award at ACL 2023, received a J.P. Morgan Fellowship, and was named an EECS Rising Star in 2022.

Speaker: Lianmin Zheng, University of California, Berkeley
Date: Tuesday, March 5
Time: 12:30pm EST
Location: CS 105
Host: Ravi Netravali
Event page: https://www.cs.princeton.edu/events/26593
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_ghiT2HbhSPqKBouJ-VwzFg

Title: Scalable and Efficient Systems for Large Language Models

Abstract: Large Language Models (LLMs) have been driving recent breakthroughs in AI. These advancements would not have been possible without the support of scalable and efficient infrastructure systems. In this talk, I will introduce several underlying systems I have designed and built to support the entire model lifecycle, from training to deployment to evaluation. First, I will present Alpa, a system for large-scale model-parallel training that automatically generates execution plans unifying data, operator, and pipeline parallelism. Next, I will discuss efficient deployment systems, covering the frontend programming interface and backend runtime optimizations for high-performance inference. Finally, I will complete the model lifecycle by presenting our model evaluation efforts, including the crowdsourced live benchmark platform, Chatbot Arena, and the automatic evaluation pipeline, LLM-as-a-Judge. These projects have collectively laid a solid foundation for large language model systems and have been widely adopted by leading LLM developers and companies. I will conclude by outlining some future directions of machine learning systems, such as co-optimizing across the full stack for building AI-centric applications.
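Two of the parallelism axes the abstract mentions can be illustrated in miniature. This is a hypothetical, greatly simplified sketch, not Alpa's actual implementation: real systems shard tensors across GPUs and run the shards concurrently, while here each "device" is just a loop iteration over plain Python lists.

```python
# Toy illustration of two parallelism axes for a single matmul layer:
# data parallelism splits the batch across devices; operator parallelism
# splits the weight matrix's columns across devices.
def matmul(x, w):
    """Plain Python matmul: x is (n x d), w is (d x m)."""
    return [[sum(xi[k] * w[k][j] for k in range(len(w)))
             for j in range(len(w[0]))] for xi in x]

def data_parallel(x, w, n_dev):
    """Each 'device' computes a slice of the batch; results concatenate."""
    chunk = (len(x) + n_dev - 1) // n_dev
    shards = [x[i:i + chunk] for i in range(0, len(x), chunk)]
    out = []
    for shard in shards:  # in a real system these run concurrently
        out.extend(matmul(shard, w))
    return out

def operator_parallel(x, w, n_dev):
    """Each 'device' holds a column slice of w; outputs concatenate per row."""
    m = len(w[0])
    chunk = (m + n_dev - 1) // n_dev
    col_shards = [[row[j:j + chunk] for row in w] for j in range(0, m, chunk)]
    partials = [matmul(x, ws) for ws in col_shards]
    return [sum((p[i] for p in partials), []) for i in range(len(x))]

x = [[1, 2], [3, 4]]
w = [[5, 6], [7, 8]]
assert data_parallel(x, w, 2) == matmul(x, w) == operator_parallel(x, w, 2)
```

Pipeline parallelism, the third axis, would additionally split the model's layers into stages that process micro-batches in sequence; the point of a system like Alpa is to search over combinations of all three automatically rather than requiring hand-written plans.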

Bio: Lianmin Zheng is a Ph.D. student in the EECS department at UC Berkeley, advised by Ion Stoica and Joseph E. Gonzalez. His research interests include machine learning systems, large language models, compilers, and distributed systems. He builds full-stack, scalable, and efficient systems to advance the development of AI. He co-founded LMSYS.org, where he leads impactful open-source large language model projects such as Vicuna and Chatbot Arena, which have received millions of downloads and served millions of users. He also co-organized the Big Model Tutorial at ICML 2022. He has received a Meta Ph.D. Fellowship, an IEEE Micro Best Paper Award, and an a16z open-source AI grant.


Speaker: Emma Dauterman, University of California, Berkeley
Date: Thursday, March 7
Time: 12:30pm EST
Location: CS 105
Host: Amit Levy
Event page: https://www.cs.princeton.edu/events/26585
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_soNesOKXSFml9MPaTx_1iw

Title: Secure systems from insecure components

Abstract: In many computer systems today, an attacker that breaks one system component can steal data from millions of users. In this talk, I will present two systems that can withstand component compromise. I will describe (1) a single sign-on system that protects user security and privacy from a compromised single sign-on server, and (2) a secure-hardware-based backup service that protects user backups from compromised secure hardware devices. These systems provide strong security and privacy properties while taking into account practical constraints such as compatibility requirements, hardware limitations, and user expectations. Each system splits user secrets across different components, using new cryptographic tools to provide the necessary functionality while protecting user data.
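The secret-splitting idea at the end of the abstract can be sketched with the simplest possible scheme, XOR-based additive secret sharing. This illustrates only the general principle that no single component learns the secret; the systems in the talk use different, more sophisticated cryptography.

```python
# Minimal sketch of splitting a secret across components so that no single
# component learns it (XOR secret sharing; illustrative only, not the
# actual protocols in these systems).
import secrets

def split(secret: bytes, n: int = 2) -> list[bytes]:
    """Split secret into n shares; all n are needed to reconstruct."""
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    last = secret
    for s in shares:
        last = bytes(a ^ b for a, b in zip(last, s))
    return shares + [last]

def combine(shares: list[bytes]) -> bytes:
    """XOR all shares back together to recover the secret."""
    out = shares[0]
    for s in shares[1:]:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out

key = b"user backup key"
shares = split(key, 3)  # e.g., one share per system component
assert combine(shares) == key
```

Any n-1 of the shares are uniformly random bytes and reveal nothing about the secret on their own, so an attacker must compromise every component holding a share to recover the user's data.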

Bio: Emma Dauterman is a Ph.D. candidate at UC Berkeley where she is advised by Raluca Ada Popa and Ion Stoica. Her research interests include computer security, systems, and applied cryptography. She has received the Microsoft Research Ada Lovelace fellowship, the NSF graduate research fellowship, and a UC Berkeley EECS excellence award.