CS Colloquium Speakers: week of March 4
CS Colloquium Speakers
Speaker: Sewon Min, University of Washington
Date: Monday, March 4
Time: 12:30pm EST
Location: CS 105
Host: Danqi Chen
Event page: https://www.cs.princeton.edu/events/26579
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_5uVGEpXNTVe8eBNSLpxKtg
Title: Rethinking Data Use in Large Language Models

Abstract: Large language models (LMs) such as ChatGPT have revolutionized natural language processing and artificial intelligence more broadly. In this talk, I will discuss my research on understanding and advancing these models, centered around how they use the very large text corpora they are trained on. First, I will describe our efforts to understand how these models learn to perform new tasks after training, demonstrating that their so-called in-context learning capabilities are almost entirely determined by what they learn from the training data. Next, I will introduce a new class of LMs—nonparametric LMs—that repurpose this training data as a data store from which they retrieve information for improved accuracy and updatability. I will describe my work on establishing the foundations of such models, including one of the first broadly used neural retrieval models and an approach that simplifies a traditional, two-stage pipeline into one. I will also discuss how nonparametric models open up new avenues for responsible data use, e.g., by segregating permissive and copyrighted text and using them differently. Finally, I will envision the next generation of LMs we should build, focusing on efficient scaling, improved factuality, and decentralization.

Bio: Sewon Min is a Ph.D. candidate in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Her research focuses on language models (LMs): studying the science of LMs, and designing new model classes and learning methods that make LMs more performant and flexible. She also studies LMs in information-seeking, legal, and privacy contexts. She is a co-organizer of multiple tutorials and workshops, including most recently at ACL 2023 on Retrieval-based Language Models and Applications and upcoming at ICLR 2024 on Mathematical and Empirical Understanding of Foundation Models. She won a paper award at ACL 2023, received a J.P. Morgan Fellowship, and was named an EECS Rising Star in 2022.
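For readers unfamiliar with nonparametric LMs, the core mechanic can be sketched in a few lines. The sketch below follows the general kNN-LM recipe (Khandelwal et al., 2020), which the abstract's "retrieve information from a data store" idea builds on; all names and details here are illustrative, not Min's actual systems.

```python
# Illustrative sketch of a nonparametric LM in the spirit of kNN-LM.
# The datastore maps stored context vectors to the token that followed them.
import numpy as np

def knn_lm_next_token_probs(query_vec, datastore_keys, datastore_next_tokens,
                            parametric_probs, vocab_size, k=8, lam=0.25):
    """Interpolate a parametric LM's next-token distribution with a
    distribution retrieved from a (context vector -> next token) datastore."""
    # Retrieve the k nearest stored contexts by L2 distance.
    dists = np.linalg.norm(datastore_keys - query_vec, axis=1)
    nearest = np.argsort(dists)[:k]
    # Turn distances into a distribution over the retrieved next tokens.
    weights = np.exp(-dists[nearest])
    weights /= weights.sum()
    knn_probs = np.zeros(vocab_size)
    for w, idx in zip(weights, nearest):
        knn_probs[datastore_next_tokens[idx]] += w
    # Final prediction: mixture of the retrieval and parametric distributions.
    return lam * knn_probs + (1 - lam) * parametric_probs

# Tiny demo: 3-token vocabulary, 5 stored (context, next-token) pairs.
rng = np.random.default_rng(0)
keys = rng.standard_normal((5, 4))
next_tokens = np.array([0, 1, 1, 2, 0])
parametric = np.array([0.5, 0.3, 0.2])
print(knn_lm_next_token_probs(keys[0], keys, next_tokens, parametric, 3, k=3))
```

Because the datastore, not the weights, carries the memorized text, entries can be added, removed, or segregated (e.g., permissive vs. copyrighted) without retraining, which is what enables the responsible-data-use angle in the abstract.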
Speaker: Lianmin Zheng, University of California, Berkeley
Date: Tuesday, March 5
Time: 12:30pm EST
Location: CS 105
Host: Ravi Netravali
Event page: https://www.cs.princeton.edu/events/26593
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_ghiT2HbhSPqKBouJ-VwzFg
Title: Scalable and Efficient Systems for Large Language Models

Abstract: Large Language Models (LLMs) have been driving recent breakthroughs in AI. These advancements would not have been possible without the support of scalable and efficient infrastructure systems. In this talk, I will introduce several underlying systems I have designed and built to support the entire model lifecycle, from training to deployment to evaluation. First, I will present Alpa, a system for large-scale model-parallel training that automatically generates execution plans unifying data, operator, and pipeline parallelism. Next, I will discuss efficient deployment systems, covering the frontend programming interface and backend runtime optimizations for high-performance inference. Finally, I will complete the model lifecycle by presenting our model evaluation efforts, including the crowdsourced live benchmark platform, Chatbot Arena, and the automatic evaluation pipeline, LLM-as-a-Judge. These projects have collectively laid a solid foundation for large language model systems and have been widely adopted by leading LLM developers and companies. I will conclude by outlining future directions for machine learning systems, such as co-optimizing across the full stack for building AI-centric applications.

Bio: Lianmin Zheng is a Ph.D. student in the EECS department at UC Berkeley, advised by Ion Stoica and Joseph E. Gonzalez. His research interests include machine learning systems, large language models, compilers, and distributed systems. He builds full-stack, scalable, and efficient systems to advance the development of AI. He co-founded LMSYS.org, where he leads impactful open-source large language model projects such as Vicuna and Chatbot Arena, which have received millions of downloads and served millions of users. He also co-organized the Big Model Tutorial at ICML 2022. He has received a Meta Ph.D. Fellowship, an IEEE Micro Best Paper Award, and an a16z open-source AI grant.
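To give a flavor of how Chatbot Arena turns crowdsourced pairwise battles into a leaderboard: early versions of the leaderboard used Elo-style ratings. The sketch below is a minimal Elo update, for intuition only; the production pipeline is more careful (e.g., Bradley-Terry-style fitting), so treat the constants and structure as assumptions.

```python
# Minimal Elo-style rating update for pairwise model battles, the kind of
# aggregation Chatbot Arena popularized (illustrative sketch only).
def elo_update(rating_a, rating_b, winner, k=32):
    """Update two models' ratings after one battle.
    winner: 'a', 'b', or 'tie'."""
    # Expected score of model A under the Elo logistic model.
    expected_a = 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
    score_a = {"a": 1.0, "b": 0.0, "tie": 0.5}[winner]
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Example: a 1000-rated model beats a 1100-rated model and gains ~20 points.
print(elo_update(1000, 1100, "a"))
```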
Speaker: Emma Dauterman, University of California, Berkeley
Date: Thursday, March 7
Time: 12:30pm EST
Location: CS 105
Host: Amit Levy
Event page: https://www.cs.princeton.edu/events/26585
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_soNesOKXSFml9MPaTx_1iw
Title: Secure systems from insecure components

Abstract: In many computer systems today, an attacker that breaks one system component can steal data from millions of users. In this talk, I will present two systems that can withstand component compromise. I will describe (1) a single sign-on system that protects user security and privacy from a compromised single sign-on server, and (2) a secure-hardware-based backup service that protects user backups from compromised secure hardware devices. These systems provide strong security and privacy properties while taking into account practical constraints such as compatibility requirements, hardware limitations, and user expectations. Each splits user secrets across different system components, using new cryptographic tools to provide necessary functionality while protecting user data.

Bio: Emma Dauterman is a Ph.D. candidate at UC Berkeley where she is advised by Raluca Ada Popa and Ion Stoica. Her research interests include computer security, systems, and applied cryptography. She has received the Microsoft Research Ada Lovelace Fellowship, the NSF Graduate Research Fellowship, and a UC Berkeley EECS Excellence Award.
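The "splits user secrets across different system components" idea has a simple baseline worth seeing concretely: 2-of-2 XOR secret sharing, where neither component alone learns anything about the secret. This is a minimal sketch of the principle only; the systems in the talk use richer cryptographic tools than plain secret sharing.

```python
# 2-of-2 XOR secret sharing: either share alone is uniformly random noise,
# but together the shares reconstruct the secret exactly.
import secrets

def split(secret: bytes):
    share1 = secrets.token_bytes(len(secret))            # uniformly random
    share2 = bytes(a ^ b for a, b in zip(secret, share1))
    return share1, share2  # store each share on a different component

def reconstruct(share1: bytes, share2: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(share1, share2))

key = secrets.token_bytes(32)
s1, s2 = split(key)
assert reconstruct(s1, s2) == key
# An attacker who compromises only one component sees only random bytes.
```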
ECE/CS Colloquium Speaker
Speaker: Max Simchowitz, Massachusetts Institute of Technology
Date: Wednesday, March 20
Time: 12:30pm EST
Location: B205 E-Quad
Host: Elad Hazan, Chi Jin
Event page: https://ece.princeton.edu/events/mathematical-foundations-physical-agents
Title: Mathematical Foundations for Physical Agents

Abstract: From robotics to autonomous vehicles, machine learning agents deployed in the physical world (“physical agents”) promise to revolutionize endeavors ranging from manufacturing to agriculture to domestic labor. In this talk, we will develop mathematical foundations, from the ground up, for how to carry out this vision. We will begin our investigation by examining linear dynamical systems, a simple and fundamental model of the interaction between a physical agent and its environment. We prove mathematically that simple exploration attains optimal performance for both some of the simplest and some of the most complex learning problems in this class. This finding, while powerful, strongly motivates moving past linear dynamics as a mathematical testbed for understanding learning with physical agents. Hence, we turn to providing mathematical guarantees for a setting of real-world importance that does not fit the linear mold: behavior cloning. Behavior cloning — teaching a robot to imitate from example demonstrations — lies at the heart of many of today’s most promising robot learning endeavors due to its intuitive data collection and simplicity. Though it can work incredibly well, we still do not have a clear understanding of what circumstances ensure its success. Bringing together the flexibility of generative models with key intuitions arising from the study of linear control, we introduce a framework for behavior cloning that enables an agent to imitate nearly arbitrary behavior with provable guarantees, even when the dynamics governing the interaction between agent and environment are nonlinear. We conclude by outlining ongoing work and future steps toward building out the mathematical and conceptual tooling needed for general, capable, and flexible physical agents.

Bio: Max Simchowitz is a postdoctoral researcher in the Robot Locomotion Group at MIT CSAIL. He studies the theoretical foundations of machine learning problems with a sequential or dynamical component; he currently focuses on robotics and out-of-distribution learning, with past work ranging broadly across control, reinforcement learning, optimization, and algorithmic fairness. He received his PhD from the University of California, Berkeley in 2021 under Ben Recht and Michael I. Jordan, and his work has been recognized with an ICML 2018 Best Paper Award, an ICML 2022 Outstanding Paper Award, and an RSS 2023 Best Paper Finalist designation.
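The linear dynamical systems the abstract starts from have a standard form worth keeping in mind. The following is a sketch in conventional LQR notation, which may differ from the talk's exact formulation:

```latex
% Linear dynamical system with quadratic cost (standard LQR notation).
% The state x_t evolves under the learner's control input u_t and noise w_t:
\[
  x_{t+1} = A x_t + B u_t + w_t, \qquad w_t \sim \mathcal{N}(0, \sigma^2 I),
\]
% and a learner that does not know (A, B) must choose inputs to keep the
% cumulative quadratic cost small:
\[
  J_T = \mathbb{E}\left[ \sum_{t=1}^{T} \left( x_t^\top Q\, x_t + u_t^\top R\, u_t \right) \right].
\]
```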
CS Colloquium Speaker
Speaker: Zhuang Liu, Meta AI Research
Date: Thursday, March 21
Time: 12:30pm EST
Location: CS 105
Host: Jia Deng
Event page: https://www.cs.princeton.edu/events/26580
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_S-u-GtwMT1Sn1R3nU-jODQ
Title: Scaling Deep Learning Up and Down

Abstract: Deep learning with neural networks has emerged as a key approach for discovering patterns and modeling relationships in complex data. AI systems powered by deep learning are used widely in applications across a broad spectrum of scales. There have been strong needs for scaling deep learning both upward and downward. Scaling up highlights the pursuit of scalability: the ability to utilize increasingly abundant computing and data resources to achieve superior capabilities, overcoming diminishing returns. Scaling down represents the demand for efficiency: there is limited data for many application domains, and deployment is often in compute-limited settings. My research focuses on scaling deep learning both up and down, to build capable models and understand their behaviors in different computational and data environments. In this talk, we present studies in both directions. For scaling up, we first explore the design of scalable neural network architectures that are widely adopted in various fields. We then discuss an intriguing observation on modern vision datasets and its implication for scaling training data. For scaling down, we introduce simple, effective, and widely used approaches for compressing convolutional networks and large language models, alongside interesting empirical findings. Notably, a recurring theme in this talk is the careful examination of implicit assumptions in the literature, which often leads to surprising revelations that reshape community understanding. Finally, we discuss exciting avenues for future deep learning and vision research, such as developing next-gen architectures and modeling datasets.

Bio: Zhuang Liu is currently a Research Scientist at Meta AI Research (FAIR) in New York City. He received his Ph.D. from UC Berkeley EECS in 2022, advised by Trevor Darrell. His research areas include deep learning and computer vision. His work focuses on scaling neural networks both up and down, to build capable models and understand their behaviors in different computational and data environments. His work is broadly applied in different areas of computing and other disciplines. He is a recipient of the CVPR 2017 Best Paper Award.
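One of the simplest "scaling down" baselines the abstract alludes to is magnitude-based weight pruning. The sketch below shows the generic technique only; it is not a specific method from the speaker's papers.

```python
# Minimal sketch of magnitude-based weight pruning: keep the largest
# weights, zero out the rest (illustrative baseline for compression).
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of weights."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

w = np.random.default_rng(0).standard_normal((4, 4))
print(magnitude_prune(w, 0.5))  # roughly half the entries set to zero
```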
**Please note, this Thursday's talk by Zhuang Liu has been RESCHEDULED for Tuesday, April 9.**
CS Colloquium Speaker
Speaker: Silvia Sellán, University of Toronto
Date: Monday, March 25
Time: 12:30pm EST
Location: CS 105
Host: Adam Finkelstein
Event page: https://www.cs.princeton.edu/events/26599
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_-avKe73_RbqHneUpsPypHw
Title: Stochastic Computer Graphics

Abstract: Computer Graphics research has long been dominated by the interests of large film, television and social media companies, forcing other, more safety-critical applications (e.g., medicine, engineering, security) to repurpose Graphics algorithms originally designed for entertainment. In this talk, I will advocate for a perspective shift in our field that allows us to design algorithms directly for these safety-critical application realms. I will show that this begins by reinterpreting traditional Graphics tasks (e.g., 3D modeling and reconstruction) through a statistical lens and quantifying the uncertainty in our algorithmic outputs, as exemplified by the research I have conducted for the past five years. I will end by mentioning several ongoing and future research directions that carry this statistical lens to entirely new problems in Graphics and Vision and into specific applications.

Bio: Silvia is a fifth-year Computer Science PhD student at the University of Toronto, working in Computer Graphics and Geometry Processing. She is a Vanier Doctoral Scholar, an Adobe Research Fellow and the winner of the 2021 University of Toronto Arts & Science Dean’s Doctoral Excellence Scholarship. She has interned twice at Adobe Research and twice at the Fields Institute of Mathematics. She is also a founder and organizer of the Toronto Geometry Colloquium and a member of WiGRAPH.

CS Colloquium Speaker
Speaker: Shiori Sagawa, Stanford University
Date: Tuesday, March 26
Time: 12:30pm EST
Location: CS 105
Host: Ellen Zhong
Event page: https://www.cs.princeton.edu/events/26601
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_gqaJBw4fQFmo-8wHqUqqdA
Title: Distributionally Robust Machine Learning

Abstract: Machine learning models are widely deployed today, but they can fail due to distribution shifts: mismatches in the data distribution between training and deployment. Models can fail on certain subpopulations (e.g., language models can fail on non-English languages) and on new domains unseen during training (e.g., medical models can fail on new hospitals). In this talk, I will discuss my work on algorithms for improving robustness to distribution shifts. First, to mitigate subpopulation shifts, I develop methods that leverage distributionally robust optimization (DRO). My methods overcome the computational and statistical obstacles of applying DRO on modern neural networks and on real-world shifts. Second, to tackle domain shifts, I build WILDS, a benchmark of real-world shifts, and show that existing methods fail on WILDS even though they perform well on synthetic shifts from prior benchmarks. I then develop a state-of-the-art method that successfully mitigates real-world domain shifts; my method proposes an alternative to domain invariance—a key principle behind the prior methods—to reflect the structure of real-world shifts. Altogether, my algorithms improve robustness to a wide range of distribution shifts in the wild, from subpopulation shifts in language modeling to domain shifts in wildlife monitoring and histopathology.

Bio: Shiori Sagawa is a final-year PhD Candidate in Computer Science at Stanford University, advised by Percy Liang. Her research focuses on algorithms for reliable machine learning. She was awarded the Stanford Graduate Fellowship and an Apple Scholars in AI/ML PhD Fellowship. Prior to her PhD, she received her B.A. in Computer Science and Molecular and Cell Biology from UC Berkeley, and she worked at D. E. Shaw Research.
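The subpopulation-shift portion of the abstract rests on a compact objective. In the group DRO formulation of Sagawa et al. (2020), the model is trained against the worst-case group rather than the average case (notation below is the standard one and may differ cosmetically from the talk's):

```latex
% Group DRO: minimize the worst expected loss over predefined groups g
% (e.g., languages or hospitals), rather than the average training loss:
\[
  \hat{\theta} \;=\; \arg\min_{\theta} \; \max_{g \in \mathcal{G}} \;
    \mathbb{E}_{(x, y) \sim P_g} \big[ \ell(\theta; (x, y)) \big].
\]
```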
CS Colloquium Speaker
Speaker: Bryan Pardo, Northwestern University
Date: Monday, October 7
Time: 12:30pm EST
Location: CS 105
Host: Adam Finkelstein
Event page: https://www.cs.princeton.edu/events/26714
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_TZ_YrTZyRu-Amzcg5uFp-Q
Title: The Future is Hear: Innovations from the Interactive Audio Lab

Abstract: The Interactive Audio Lab, headed by Bryan Pardo, works at the intersection of machine learning, signal processing and human-computer interaction. The lab invents new tools to generate, modify, find, separate, and label sound. In this talk, Prof. Pardo will discuss three projects illustrative of the work in the lab:

Text2FX: Audio effects (e.g., equalization, reverberation, compression) are a cornerstone of modern audio production. However, their complex and unintuitive controls (e.g., decay, cutoff frequency) make them challenging for non-technical musicians, podcasters and sound artists. As people naturally describe sound in terms like 'bright' or 'warm,' natural language can serve as a more intuitive and accessible way to navigate the complex parameter spaces of audio effects. Text2FX leverages a shared audio-text embedding space (CLAP) and differentiable digital signal processing (DDSP) to control audio effects, such as equalization and reverberation, using open-vocabulary natural language prompts (e.g., “make it sound in-your-face and bold”).

VampNet: In recent years, advances in discrete acoustic token modeling have resulted in significant leaps in autoregressive generation of speech and music. Meanwhile, approaches that use non-autoregressive parallel iterative decoding have been developed for efficient image synthesis. In this work, we combine parallel iterative decoding with acoustic token modeling and apply them to music audio synthesis. The resulting model, VampNet, is fast enough for interactive performance and can be prompted with music audio, making it well suited for creating loops and variational accompaniment in artistic contexts.

VoiceBlock: Deep-learning-based speaker recognition systems can facilitate mass surveillance, allowing search for a target speaker through thousands of concurrent voice communications. In this work, we propose a highly effective approach to anonymize speech against an automated speaker recognition system while leaving the voice perceptually unaltered to a human listener. Because our method does not conceal speaker identity from human listeners, it still allows high-effort targeted surveillance (e.g., authorized human-attended wiretaps of criminal enterprises), while making mass automated surveillance significantly less reliable. In this way, we hope to return to the status quo of the 20th and early 21st centuries – in which the need for human listeners provided an important check on mass surveillance.

Bio: Bryan Pardo studies fundamental problems in computer audition, content-based audio search, and generative modeling of audio, and also develops inclusive interfaces for audio production. He is head of Northwestern University’s Interactive Audio Lab and co-director of the Northwestern University Center for HCI+Design. Prof. Pardo has appointments in the Department of Computer Science and the Department of Radio, Television and Film. He received an M.Mus. in Jazz Studies in 2001 and a Ph.D. in Computer Science in 2005, both from the University of Michigan. He has authored over 140 peer-reviewed publications. He has developed speech analysis software for the Speech and Hearing Department of the Ohio State University and statistical software for SPSS, and has worked as a machine learning researcher for General Dynamics. His patented technologies have been productized by companies including Bose, Adobe, Lexi, and Ear Machine. While finishing his doctorate, he taught in the Music Department of Madonna University. When he is not teaching or researching, he performs on saxophone and clarinet with the bands Son Monarcas and The East Loop.
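To make the Text2FX control loop concrete: choose effect parameters so the processed audio's embedding moves toward the text prompt's embedding in a shared audio-text space. In the heavily simplified sketch below, the encoders and the effect are stand-in stubs, and parameters are found by grid search; the real system instead uses CLAP encoders and differentiable DDSP effects optimized by gradient descent.

```python
# Hedged sketch of Text2FX-style control: maximize similarity between the
# processed audio's embedding and the prompt's embedding. All components
# below are stubs standing in for CLAP encoders and DDSP effects.
import numpy as np

def embed_text(prompt: str) -> np.ndarray:
    # Stub for a CLAP text encoder: arbitrary pseudo-embedding of the prompt.
    rng = np.random.default_rng(abs(hash(prompt)) % 2**32)
    return rng.standard_normal(64)

def embed_audio(audio: np.ndarray) -> np.ndarray:
    # Stub for a CLAP audio encoder: crude fixed-length summary.
    return np.resize(audio, 64)

def apply_drive(audio: np.ndarray, drive: float) -> np.ndarray:
    # Stub effect: soft-clipping distortion with one 'drive' parameter.
    return np.tanh(drive * audio)

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

def text2fx_search(audio, prompt, candidates=np.linspace(0.1, 4.0, 40)):
    # Grid search over the single effect parameter.
    target = embed_text(prompt)
    return max(candidates,
               key=lambda d: cosine(embed_audio(apply_drive(audio, d)), target))

audio = np.random.default_rng(0).standard_normal(16000)
print(text2fx_search(audio, "make it sound in-your-face and bold"))
```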
CS Colloquium Speaker
Speaker: Jeffrey Bigham, Carnegie Mellon University
Date: Monday, November 4
Time: 12:30pm EST
Location: CS 105
Host: Andrés Monroy-Hernández
Event page: https://www.cs.princeton.edu/events/26736
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_rsGVIJvfSfuDNAZVKJ5Dlg
Title: How Easy Access to Statistical Likelihoods of Everything Will Change Interaction with Computers

Abstract: The recent arrival of impressive large language models and coding assistants has led to speculation that the way we interact with computers would dramatically (and quickly!) change. That hasn’t really happened… yet, but we are at an inflection point where we can influence interaction for both better and, potentially, worse. In this talk, I’ll use examples from our research to highlight four coming challenges and opportunities in how we interact with computers in (i) maintaining user agency, (ii) designing user interfaces that encourage responsibility, (iii) making computer systems accessible, and (iv) designing, generating, and navigating user interfaces automatically. The future of human-computer interaction will be both more familiar and less familiar than we think; this talk is intended to help develop your sense of what is likely to be and which futures you want to build.

Bio: Jeffrey P. Bigham is an Associate Professor in the Human-Computer Interaction and Language Technologies Institutes in the School of Computer Science at Carnegie Mellon University, and the Director of Human-Centered Machine Learning within AIML at Apple. He builds systems that advance how people can responsibly work with machine learning to do interesting and useful things. This has taken on a variety of focuses throughout his career – he has worked on applications in accessibility for people with disabilities, systems that used crowdsourcing to power a wide variety of real-time experiences, and most recently on how we can design responsible and useful experiences using generative AI. Much of his work has focused on accessibility because he sees the field as a window into the future, given that people with disabilities are often the earliest adopters of AI. Bigham received his B.S.E. degree in Computer Science from Princeton University in 2003, and his Ph.D. in Computer Science and Engineering from the University of Washington in 2009.
CS Colloquium Speaker
Speaker: Jian Ma, Carnegie Mellon University
Date: Tuesday, December 3
Time: 12:30pm EST
Location: CS 105
Host: Yuri Pritykin
Event page: https://www.cs.princeton.edu/events/26752
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_qpPNLsEeRPakpWWB91a_og
Title: Learning Multiscale Genome and Cellular Organization

Abstract: Despite significant advancements in high-throughput data acquisition in genomics and cell biology, our understanding of the diverse cell types within the human body remains limited. The principles governing intracellular molecular spatial organization and interactions, as well as cellular spatial organization within complex tissues, are still largely unclear. A major challenge lies in developing computational methods capable of integrating heterogeneous, multiscale molecular, cellular, and tissue information. In this talk, I will discuss our work on developing machine learning approaches to advance regulatory genomics through single-cell 3D epigenomics. Additionally, I will introduce our recent efforts in creating interpretable, self-supervised models for the multiscale delineation of cellular interactions in tissues. These methods hold the potential to reveal new insights into fundamental genome structure, gene regulation, and cellular function across a wide range of biological contexts in both health and disease.

Bio: Jian Ma is the Ray and Stephanie Lane Professor of Computational Biology in the School of Computer Science at Carnegie Mellon University. He leads a research group dedicated to developing advanced AI/ML methods for exploring the structural and functional complexity of the human genome and cellular organization. He recently founded the Center for AI-Driven Biomedical Research (AI4BIO) at CMU, which aims to advance AI/ML development for decoding the molecular language governing cellular behavior. He serves as the Contact PI for a Center grant in the NIH 4D Nucleome Program and as Co-Chair of its Steering Committee. He is also a member of the Scientific Advisory Board of the Chan Zuckerberg Biohub Chicago and the RECOMB Steering Committee. His contributions have earned him several honors, including an NSF CAREER Award, a Guggenheim Fellowship, and election as a Fellow of the American Association for the Advancement of Science (AAAS).
CS Colloquium Speaker
Speaker: Kianté Brantley, Cornell University
Date: Monday, April 1
Time: 12:30pm EST
Location: CS 105
Host: Ryan Adams
Event page: https://www.cs.princeton.edu/events/26611
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_UxpKpyaqSoimB7SFm9qnrA
Title: Learning from Interaction

Abstract: Machine learning systems have seen advancements due to large models pre-trained on vast amounts of data. These pre-trained models have led to progress on various downstream tasks when fine-tuned. However, for machine learning systems to function in real-world environments, they must overcome certain challenges that are not influenced by model or dataset sizes. One potential solution is to fine-tune machine learning models based on online interactions. In this talk, I will present my research on developing natural language processing systems that learn from interacting in an environment. I will begin by describing the issues that arise when systems are trained on offline data and then deployed in interactive environments. Additionally, I will present an algorithm that addresses these issues using only environmental interaction without additional supervision. Moreover, I will demonstrate how learning from interaction can improve natural language processing systems. Finally, I will present a set of new interactive learning algorithms explicitly designed for natural language processing systems.

Bio: Kianté Brantley is a Postdoctoral Associate in the Department of Computer Science at Cornell University, working with Thorsten Joachims. He completed his Ph.D. in Computer Science at the University of Maryland, College Park, advised by Dr. Hal Daumé III. His research focuses on developing machine learning models that can make automated decisions in the real world with minimal supervision. His research lies at the intersection of imitation learning, reinforcement learning, and natural language processing. He is a recipient of the NSF LSAMP BD Fellowship, ACM SIGHPC Computational and Data Science Fellowship, Microsoft Dissertation Research Grant, Ann G. Wylie Dissertation Fellowship, and NSF CIFellow Postdoctoral Fellowship.
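The offline-training vs. interactive-deployment mismatch the abstract describes has a classic illustration in imitation learning: a policy trained only on expert data drifts into states the expert never visited. DAgger (Ross et al., 2011) fixes this by querying the expert on the learner's own rollouts. The sketch below is a toy instance of that general recipe, not an algorithm from the talk; the environment and expert are made-up stubs.

```python
# Hedged sketch of DAgger-style interactive imitation learning.
import numpy as np
from sklearn.linear_model import LogisticRegression

def expert(s):                      # toy expert: push the state toward 0
    return 0 if s[0] > 0 else 1

def step(s, a):                     # toy dynamics: action 0 moves left, 1 right
    return [s[0] + (-0.5 if a == 0 else 0.5) + np.random.normal(0, 0.1)]

def dagger(rounds=5, horizon=50):
    states, labels = [], []
    policy = None
    for _ in range(rounds):
        s = [np.random.uniform(-2, 2)]
        for _ in range(horizon):
            # Act with the learner's current policy (expert in round 0)...
            a = expert(s) if policy is None else int(policy.predict([s])[0])
            # ...but label every visited state with the expert's action, so
            # the training set covers the learner's own state distribution.
            states.append(list(s))
            labels.append(expert(s))
            s = step(s, a)
        policy = LogisticRegression().fit(states, labels)
    return policy

policy = dagger()
print(policy.predict([[1.0], [-1.0]]))  # expect [0, 1]: move toward 0
```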
CS Colloquium Speaker
Speaker: Saksham Agarwal, Cornell University
Date: Tuesday, April 2
Time: 12:30pm EST
Location: CS 105
Host: Wyatt Lloyd
Event page: https://www.cs.princeton.edu/events/26606
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_3C4sOdgeSeK8v3P9IdNncg
Title: The Host Network (and its implications to network protocols, OS and hardware)

Abstract: The host network enables data transfers within hosts and forms the “last mile” for data transfers across hosts for distributed applications. This talk will reflect on my (ongoing) journey that started with a surprising phenomenon observed in a lab experiment—nanosecond-scale inefficiencies within the host network percolating through network protocols and OS to create millisecond-scale impact on distributed applications. I will discuss my work on understanding, characterizing, and resolving the above phenomenon in the lab and in production clusters. I will also discuss how this phenomenon opens up intriguing research questions at the intersection of computer networking, OS and architecture.

Bio: Saksham Agarwal is a PhD student in the Computer Science department at Cornell University, advised by Prof. Rachit Agarwal. He did his undergraduate studies at IIT Kanpur. He is a recipient of a Google PhD Fellowship, a Cornell University Fellowship, a SIGCOMM Best Student Paper Award, and a Cornell CS Outstanding TA Award.
CS Colloquium Speaker
Speaker: Sagar Karandikar, University of California, Berkeley
Date: Wednesday, April 3
Time: 12:30pm EST
Location: CS 105
Host: Margaret Martonosi
Event page: https://www.cs.princeton.edu/events/26592
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_ulF5FP8GSX6sV8HIYB4v1g
*Live stream is available to Princeton University ID holders only.
Title: Catch M(oor)e If You Can: Agile Hardware/Software Co-Design for Hyperscale Cloud Systems

Abstract: Global reliance on cloud services, powered by transformative technologies like generative AI, machine learning, and big-data analytics, is driving exponential growth in demand for hyperscale cloud compute infrastructure. Meanwhile, the breakdown of classical hardware scaling (e.g., Moore's Law) is hampering growth in compute supply. Building domain-specific hardware can address this supply-demand gap, but catching up with exponential demand requires developing new hardware rapidly and with confidence that performance/efficiency gains will compound in the context of a complete system. These are challenging tasks given the status quo in hardware design, even before accounting for the immense scale of cloud systems. This talk will focus on two themes of my work: (1) developing radically new agile, end-to-end hardware/software co-design tools that challenge the status quo in hardware design for systems of all scales and unlock the ability to innovate on new hardware at datacenter scale; and (2) leveraging these tools and insights from hyperscale datacenter fleet profiling to architect and implement state-of-the-art domain-specific hardware that addresses key efficiency challenges in hyperscale cloud systems. I will first cover my work creating the award-winning and widely used FireSim FPGA-accelerated hardware simulation platform, which provides unprecedented hardware/software co-design capabilities. FireSim automatically constructs high-performance, cycle-exact, scale-out simulations of novel hardware designs derived from the tapeout-friendly RTL code that describes them, empowering hardware designers and domain experts alike to directly iterate on new hardware designs in hours rather than years. FireSim also unlocks innovation in datacenter hardware with the unparalleled ability to scale to massive, distributed simulations of thousand-node networked datacenter clusters with specialized server designs and complete control over the datacenter architecture. I will then briefly cover my work co-creating the also widely used Chipyard platform for agile construction, simulation (including FireSim), and tape-out of specialized RISC-V System-on-Chip (SoC) designs using a novel, RTL-generator-driven approach. Next, I will discuss my work in collaboration with Google on Hyperscale SoC, a cloud-optimized server chip built, evaluated, and taped-out with FireSim and Chipyard. Hyperscale SoC includes my work on several novel domain-specific accelerators (DSAs) for expensive but foundational operations in hyperscale servers, including (de)serialization, (de)compression, and more. Hyperscale SoC demonstrates a new paradigm of data-driven, end-to-end hardware/software co-design, combining key insights from profiling Google's world-wide datacenter fleet with the ability to rapidly build and evaluate novel hardware designs in FireSim/Chipyard. This instance of Hyperscale SoC is just the beginning; I will conclude by covering the wide-ranging opportunities that can now be explored for radically redesigning next-generation hyperscale cloud datacenters.

Bio: Sagar Karandikar is a Ph.D. Candidate at UC Berkeley and a Student Researcher at Google. His work broadly focuses on co-designing hardware and software to build next generation hyperscale cloud systems. He is also interested in agile, open-source hardware development methodologies. His first-author publications have received several honors, including being selected for the ISCA@50 25-year Retrospective, as an IEEE Micro Top Pick, as an IEEE Micro Top Pick Honorable Mention, and as the MICRO '21 Distinguished Artifact Award winner. He created and leads the FireSim project, which has been used as a foundational research platform in over 50 peer-reviewed publications from first authors at over 20 institutions. FireSim has also been used in the development of commercially available chips and as a standard host platform for DARPA and IARPA programs. He is a co-creator and co-lead of the also widely used Chipyard RISC-V System-on-Chip (SoC) development platform. His work on Hyperscale SoC has been influential at Google and more broadly across other silicon vendors. He was selected as a 2022 DARPA Riser and received the UC Berkeley Outstanding Graduate Student Instructor (TA) Award. He received his M.S. and B.S. from UC Berkeley.

CS Colloquium Speaker
Speaker: Yilun Du, Massachusetts Institute of Technology
Date: Thursday, April 4
Time: 12:30pm EST
Location: CS 105
Host: Felix Heide
Event page: https://www.cs.princeton.edu/events/26595
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_6ikSJRvFQb-ywv0jmzKgxA
Title: Generalizing Beyond the Training Distribution through Compositional Generation

Abstract: Generative AI has led to stunning successes in recent years but is fundamentally limited by the amount of data available. This is especially limiting in the embodied setting – where an agent must solve new tasks in new environments. In this talk, I’ll introduce the idea of compositional generative modeling, which enables generalization beyond the training data by building complex generative models from smaller constituents. I’ll first introduce the idea of energy-based models and illustrate how they enable compositional generative modeling. I’ll then illustrate how such compositional models enable us to synthesize complex plans for unseen tasks at inference time. Finally, I'll show how such compositionality can be applied to multiple foundation models trained on various forms of Internet data, enabling us to construct decision-making systems that can hierarchically plan and solve long-horizon problems in a zero-shot manner.

Bio: Yilun Du is a final-year PhD student at MIT CSAIL, advised by Leslie Kaelbling, Tomas Lozano-Perez and Joshua Tenenbaum. His research spans the fields of machine learning and robotics, with a focus on generative models. He is supported by the NSF Graduate Research Fellowship and was previously a research fellow at OpenAI, a visiting researcher at FAIR, and a student researcher at Google DeepMind.
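The compositional energy-based modeling idea in Du's abstract has a compact mathematical core: each constituent model contributes an energy term, and models compose by adding energies, i.e., multiplying distributions. A sketch in standard EBM notation (illustrative, not the talk's exact formulation):

```latex
% Each component model is an energy-based model, p_i(x) ∝ exp(-E_i(x)).
% Composing concepts multiplies the distributions, i.e., sums the energies:
\[
  p_{\text{composed}}(x) \;\propto\; \prod_{i} p_i(x)
  \;=\; \frac{1}{Z} \exp\Big(\!-\sum_{i} E_i(x)\Big).
\]
% Sampling from the summed energy yields outputs that satisfy all component
% constraints at once, which is what enables generalization to unseen
% combinations of concepts.
```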
CS Colloquium Speaker Speaker: Kianté Brantley, Cornell University Date: Monday, April 1 Time: 12:30pm EST Location: CS 105 Host: Ryan Adams Event page: https://www.cs.princeton.edu/events/26611 Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_UxpKpyaqSoimB7SFm9qnrA Title: Learning from Interaction Abstract: Machine learning systems have advanced thanks to large models pre-trained on vast amounts of data. When fine-tuned, these pre-trained models have driven progress on a variety of downstream tasks. However, for machine learning systems to function in real-world environments, they must overcome challenges that model and dataset scale alone do not address. One potential solution is to fine-tune machine learning models based on online interactions. In this talk, I will present my research on developing natural language processing systems that learn by interacting with an environment. I will begin by describing the issues that arise when systems are trained on offline data and then deployed in interactive environments. I will then present an algorithm that addresses these issues using only environmental interaction, without additional supervision. I will also demonstrate how learning from interaction can improve natural language processing systems. Finally, I will present a set of new interactive learning algorithms designed explicitly for natural language processing systems. Bio: Kianté Brantley is a Postdoctoral Associate in the Department of Computer Science at Cornell University, working with Thorsten Joachims. He completed his Ph.D. in Computer Science at the University of Maryland, College Park, advised by Dr. Hal Daumé III. His research focuses on developing machine learning models that can make automated decisions in the real world with minimal supervision. His research lies at the intersection of imitation learning, reinforcement learning, and natural language processing. He is a recipient of the NSF LSAMP BD Fellowship, the ACM SIGHPC Computational and Data Science Fellowship, a Microsoft Dissertation Research Grant, the Ann G. Wylie Dissertation Fellowship, and an NSF CIFellow Postdoctoral Fellowship.
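To make "learning from interaction" concrete, here is a minimal, self-contained sketch of the general idea: a toy bandit-style policy-gradient (REINFORCE) loop in which a text policy improves purely from scalar environment feedback, with no additional supervision. The actions and reward function are hypothetical illustrations, not the speaker's algorithm.

    import math
    import random

    random.seed(0)

    ACTIONS = ["paraphrase", "copy"]      # hypothetical text-generation actions
    logits = {a: 0.0 for a in ACTIONS}    # the policy is a softmax over these logits

    def action_probs():
        exps = {a: math.exp(v) for a, v in logits.items()}
        z = sum(exps.values())
        return {a: e / z for a, e in exps.items()}

    def sample_action():
        r, probs = random.random(), action_probs()
        for a, p in probs.items():
            r -= p
            if r <= 0:
                return a
        return a

    def env_reward(action):
        # Hypothetical environment: "paraphrase" succeeds 80% of the time.
        return 1.0 if (action == "paraphrase" and random.random() < 0.8) else 0.0

    baseline, lr = 0.0, 0.5
    for _ in range(500):
        a = sample_action()
        r = env_reward(a)
        baseline += 0.05 * (r - baseline)   # running reward baseline
        probs = action_probs()
        for act in ACTIONS:                 # REINFORCE: indicator minus probability
            indicator = 1.0 if act == a else 0.0
            logits[act] += lr * (r - baseline) * (indicator - probs[act])

    print(logits)  # "paraphrase" ends with the clearly higher logit

Real interactive NLP systems replace the two-action toy policy with a language model and the coin-flip reward with feedback from the deployment environment, but the update has the same shape.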
CS Colloquium Speaker Speaker: Saksham Agarwal, Cornell University Date: Tuesday, April 2 Time: 12:30pm EST Location: CS 105 Host: Wyatt Lloyd Event page: https://www.cs.princeton.edu/events/26606 Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_3C4sOdgeSeK8v3P9IdNncg Title: The Host Network (and its implications for network protocols, OS and hardware) Abstract: The host network enables data transfers within hosts and forms the "last mile" for data transfers across hosts for distributed applications. This talk will reflect on my (ongoing) journey, which started with a surprising phenomenon observed in a lab experiment: nanosecond-scale inefficiencies within the host network percolating through network protocols and the OS to create millisecond-scale impact on distributed applications. I will discuss my work on understanding, characterizing, and resolving this phenomenon in the lab and in production clusters. I will also discuss how it opens up intriguing research questions at the intersection of computer networking, operating systems, and architecture. Bio: Saksham Agarwal is a PhD student in the Computer Science department at Cornell University, advised by Prof. Rachit Agarwal. He did his undergraduate studies at IIT Kanpur. He is a recipient of a Google PhD Fellowship, a Cornell University Fellowship, a SIGCOMM Best Student Paper Award, and a Cornell CS Outstanding TA Award.
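As a rough illustration of the scale gap described above, the back-of-envelope sketch below (all numbers assumed purely for illustration, not taken from the talk) shows how nanosecond-scale per-packet stalls inside a host can compound into millisecond-scale request latency, particularly once queues approach saturation:

    # Hypothetical numbers: per-packet host stall and packets per request.
    stall_ns_per_packet = 300            # assumed host-network stall per packet
    packets_per_request = 2000           # e.g., a multi-MB response in MTU-sized packets
    direct_ms = stall_ns_per_packet * packets_per_request / 1e6
    print(f"direct inflation: {direct_ms:.2f} ms per request")

    # If stalls push a shared queue toward saturation, delay grows nonlinearly;
    # a simple M/M/1-style factor 1/(1 - utilization) illustrates the blow-up.
    for util in (0.5, 0.9, 0.99):
        print(f"utilization {util:.2f} -> roughly {direct_ms / (1 - util):.1f} ms")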
CS Colloquium Speaker Speaker: Sagar Karandikar, University of California, Berkeley Date: Wednesday, April 3 Time: 12:30pm EST Location: CS 105 Host: Margaret Martonosi Event page: https://www.cs.princeton.edu/events/26592 Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_ulF5FP8GSX6sV8HIYB4v1g *Live stream is available to Princeton University ID holders only. Title: Catch M(oor)e If You Can: Agile Hardware/Software Co-Design for Hyperscale Cloud Systems Abstract: Global reliance on cloud services, powered by transformative technologies like generative AI, machine learning, and big-data analytics, is driving exponential growth in demand for hyperscale cloud compute infrastructure. Meanwhile, the breakdown of classical hardware scaling (e.g., Moore's Law) is hampering growth in compute supply. Building domain-specific hardware can address this supply-demand gap, but catching up with exponential demand requires developing new hardware rapidly and with confidence that performance/efficiency gains will compound in the context of a complete system. These are challenging tasks given the status quo in hardware design, even before accounting for the immense scale of cloud systems. This talk will focus on two themes of my work: (1) developing radically new, agile, end-to-end hardware/software co-design tools that challenge the status quo in hardware design for systems of all scales and unlock the ability to innovate on new hardware at datacenter scale; and (2) leveraging these tools, together with insights from hyperscale datacenter fleet profiling, to architect and implement state-of-the-art domain-specific hardware that addresses key efficiency challenges in hyperscale cloud systems. I will first cover my work creating the award-winning and widely used FireSim FPGA-accelerated hardware simulation platform, which provides unprecedented hardware/software co-design capabilities. FireSim automatically constructs high-performance, cycle-exact, scale-out simulations of novel hardware designs derived from the tapeout-friendly RTL code that describes them, empowering hardware designers and domain experts alike to iterate directly on new hardware designs in hours rather than years. FireSim also unlocks innovation in datacenter hardware with the unparalleled ability to scale to massive, distributed simulations of thousand-node networked datacenter clusters with specialized server designs and complete control over the datacenter architecture. I will then briefly cover my work co-creating the widely used Chipyard platform for agile construction, simulation (including FireSim), and tape-out of specialized RISC-V System-on-Chip (SoC) designs using a novel, RTL-generator-driven approach. Next, I will discuss my work in collaboration with Google on Hyperscale SoC, a cloud-optimized server chip built, evaluated, and taped out with FireSim and Chipyard. Hyperscale SoC includes my work on several novel domain-specific accelerators (DSAs) for expensive but foundational operations in hyperscale servers, including (de)serialization, (de)compression, and more. Hyperscale SoC demonstrates a new paradigm of data-driven, end-to-end hardware/software co-design, combining key insights from profiling Google's worldwide datacenter fleet with the ability to rapidly build and evaluate novel hardware designs in FireSim/Chipyard.
This instance of Hyperscale SoC is just the beginning; I will conclude by covering the wide-ranging opportunities that can now be explored for radically redesigning next-generation hyperscale cloud datacenters. Bio: Sagar Karandikar is a Ph.D. candidate at UC Berkeley and a Student Researcher at Google. His work broadly focuses on co-designing hardware and software to build next-generation hyperscale cloud systems. He is also interested in agile, open-source hardware development methodologies. His first-author publications have received several honors, including being selected for the ISCA@50 25-year Retrospective, as an IEEE Micro Top Pick, as an IEEE Micro Top Pick Honorable Mention, and as the MICRO '21 Distinguished Artifact Award winner. He created and leads the FireSim project, which has been used as a foundational research platform in over 50 peer-reviewed publications by first authors at over 20 institutions. FireSim has also been used in the development of commercially available chips and as a standard host platform for DARPA and IARPA programs. He is a co-creator and co-lead of the widely used Chipyard RISC-V System-on-Chip (SoC) development platform. His work on Hyperscale SoC has been influential at Google and more broadly across other silicon vendors. He was selected as a 2022 DARPA Riser and received the UC Berkeley Outstanding Graduate Student Instructor (TA) Award. He received his M.S. and B.S. from UC Berkeley.
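For context on the "hours rather than years" claim, a simple order-of-magnitude calculation (with assumed simulation rates chosen for illustration, not FireSim's published figures) shows why FPGA-accelerated, cycle-exact simulation changes the iteration loop:

    # Time to simulate one second of a hypothetical 3 GHz target design.
    target_cycles = 3e9
    rates_hz = {
        "software RTL simulation": 5e3,       # ~kHz range, assumed
        "FPGA-accelerated simulation": 50e6,  # ~tens of MHz, assumed
    }
    for name, hz in rates_hz.items():
        hours = target_cycles / hz / 3600
        print(f"{name}: {hours:,.2f} hours per simulated second")

At the assumed rates, one simulated second of target time takes roughly a week of software simulation but about a minute on an FPGA-accelerated platform.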
CS Colloquium Speaker Speaker: Yilun Du, Massachusetts Institute of Technology Date: Thursday, April 4 Time: 12:30pm EST Location: CS 105 Host: Felix Heide Event page: https://www.cs.princeton.edu/events/26595 Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_6ikSJRvFQb-ywv0jmzKgxA Title: Generalizing Beyond the Training Distribution through Compositional Generation Abstract: Generative AI has led to stunning successes in recent years but is fundamentally limited by the amount of data available. This is especially limiting in the embodied setting, where an agent must solve new tasks in new environments. In this talk, I'll introduce the idea of compositional generative modeling, which enables generalization beyond the training data by building complex generative models from smaller constituents. I'll first introduce the idea of energy-based models and illustrate how they enable compositional generative modeling. I'll then illustrate how such compositional models enable us to synthesize complex plans for unseen tasks at inference time. Finally, I'll show how such compositionality can be applied to multiple foundation models trained on various forms of Internet data, enabling us to construct decision-making systems that can hierarchically plan and solve long-horizon problems in a zero-shot manner. Bio: Yilun Du is a final-year PhD student at MIT CSAIL, advised by Leslie Kaelbling, Tomas Lozano-Perez, and Joshua Tenenbaum. His research spans the fields of machine learning and robotics, with a focus on generative models. He is supported by the NSF Graduate Research Fellowship and was previously a research fellow at OpenAI, a visiting researcher at FAIR, and a student researcher at Google DeepMind.
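A minimal sketch of the compositional principle behind energy-based models: adding two energy functions multiplies the corresponding probability densities, so a sampler run on the summed energy produces samples that satisfy both constituent concepts at once. The toy 1-D example below (illustrative values, not the speaker's implementation) samples the composition with Langevin dynamics:

    import math
    import random

    random.seed(0)

    E1 = lambda x: (x - 2.0) ** 2 / 2   # concept 1: "x near 2"
    E2 = lambda x: (x + 1.0) ** 2 / 2   # concept 2: "x near -1"

    def grad(E, x, h=1e-4):             # numerical gradient of an energy
        return (E(x + h) - E(x - h)) / (2 * h)

    x, step = 0.0, 0.01
    for _ in range(5000):               # Langevin dynamics on E1 + E2
        g = grad(E1, x) + grad(E2, x)   # composing models = summing energies
        x += -step * g + math.sqrt(2 * step) * random.gauss(0, 1)

    print(round(x, 2))  # samples concentrate near 0.5, the compromise of both concepts

The same additive principle, applied to learned energy functions rather than toy quadratics, is what allows complex generative models to be assembled from smaller constituents.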
CS Colloquium Speaker Speaker: Zhuang Liu, Meta AI Research Date: Tuesday, April 9 Time: 12:30pm EST Location: CS 105 Host: Jia Deng Event page: https://www.cs.princeton.edu/events/26580 Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_S-u-GtwMT1Sn1R3nU-jODQ Title: Scaling Deep Learning Up and Down Abstract: Deep learning with neural networks has emerged as a key approach for discovering patterns and modeling relationships in complex data. AI systems powered by deep learning are used widely in applications across a broad spectrum of scales. There are strong needs to scale deep learning both upward and downward. Scaling up highlights the pursuit of scalability: the ability to utilize increasingly abundant computing and data resources to achieve superior capabilities, overcoming diminishing returns. Scaling down represents the demand for efficiency: many application domains have limited data, and deployment often happens in compute-limited settings. My research focuses on scaling deep learning both up and down, to build capable models and understand their behaviors in different computational and data environments. In this talk, we present studies in both directions. For scaling up, we first explore the design of scalable neural network architectures that are widely adopted in various fields. We then discuss an intriguing observation on modern vision datasets and its implication for scaling training data. For scaling down, we introduce simple, effective, and widely used approaches for compressing convolutional networks and large language models, alongside interesting empirical findings. Notably, a recurring theme in this talk is the careful examination of implicit assumptions in the literature, which often leads to surprising revelations that reshape community understanding. Finally, we discuss exciting avenues for future deep learning and vision research, such as developing next-gen architectures and modeling datasets. Bio: Zhuang Liu is currently a Research Scientist at Meta AI Research (FAIR) in New York City. He received his Ph.D. from UC Berkeley EECS in 2022, advised by Trevor Darrell. His research areas include deep learning and computer vision. His work focuses on scaling neural networks both up and down, to build capable models and understand their behaviors in different computational and data environments. His work is broadly applied in different areas of computing and other disciplines. He is a recipient of the CVPR 2017 Best Paper Award.
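As one concrete, generic instance of the "scaling down" direction, the sketch below implements one-shot global magnitude pruning, a standard compression baseline that zeroes the smallest-magnitude weights (shown as a representative technique, not necessarily the speaker's exact method):

    def magnitude_prune(weights, sparsity=0.5):
        """weights: list of floats; returns a copy with the smallest
        `sparsity` fraction of weights (by absolute value) set to zero;
        ties at the threshold are also zeroed."""
        k = int(len(weights) * sparsity)
        if k == 0:
            return list(weights)
        threshold = sorted(abs(w) for w in weights)[k - 1]
        return [0.0 if abs(w) <= threshold else w for w in weights]

    w = [0.05, -1.2, 0.3, -0.01, 0.8, -0.4]
    print(magnitude_prune(w, sparsity=0.5))
    # -> [0.0, -1.2, 0.0, 0.0, 0.8, -0.4]

In practice the same idea is applied per layer or globally to millions of weights, typically followed by fine-tuning to recover accuracy.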