
CS Colloquium Speakers

Speaker: Haoshu Fang, Massachusetts Institute of Technology
Date: Tuesday, April 1
Time: 12:30pm EST
Location: CS 105
Host: Ben Eysenbach
Event page: https://www.cs.princeton.edu/events/science-data-human-level-robotic-manipul...
Register for the live stream here: https://princeton.zoom.us/webinar/register/WN_i8siw7H2ROmf0g3f9JlGTg

Title: Science of Data for Human-Level Robotic Manipulation

Abstract: Machine learning has revolutionized many subfields of robotics, from visual perception to task planning. However, the fundamental challenge of low-level motor control for object manipulation with raw sensory observations remains unresolved, primarily due to the lack of robot state and action data during manipulation. This issue is particularly pronounced in tasks requiring multi-finger coordination and fine tactile sensing. Addressing the data problem is essential, as many modalities of robotic data, such as tactile and proprioceptive information, are not readily available online. The key scientific questions in this domain are: (i) how to collect data, (ii) what data to collect, and (iii) how to learn effectively from such data. In this talk, I will (i) introduce a novel paradigm for data collection through the design of innovative interaction interfaces, (ii) demonstrate how identifying key dimensions for scaling data can enable human-level robotic grasping, and (iii) present methods and insights on efficiently learning from heterogeneous robotic data.

Bio: Haoshu Fang is a postdoctoral researcher at MIT CSAIL, working with Pulkit Agrawal and Edward Adelson. He earned his PhD from Shanghai Jiao Tong University. Haoshu's research focuses on general robotic manipulation, addressing the data challenge by designing data-centric hardware, leveraging data scaling laws, and developing data-efficient learning methods. His work has been recognized with three best paper awards or nominations at top robotics conferences, as well as fellowships from Microsoft, Baidu, and ByteDance.

Speaker: Shishir Patil, University of California, Berkeley
Date: Thursday, April 3
Time: 12:30pm EST
Location: CS 105
Host: Ravi Netravali
Event page: https://www.cs.princeton.edu/events/agenticsystems-teaching-llms-use-tools-s...
Register for the live stream here: https://princeton.zoom.us/webinar/register/WN_RXiSP_bZThmG6agRH4trNA

Title: AgenticSystems: Teaching LLMs to Use Tools at Scale

Abstract: In this talk, I will present our vision of AgenticSystems and introduce our approach to integrating Large Language Models (LLMs) with various tools via APIs. Connecting LLMs with APIs presents a significant challenge, as these models often struggle to generate precise input arguments and are prone to hallucinating API calls. To address this, we developed Gorilla LLM, trained with our novel Retriever-Aware Training (RAT), setting a new benchmark for tool use in LLMs. Gorilla also introduces a programming-language-inspired metric to quantify hallucinations, a common issue in LLMs. I will conclude by presenting GoEx, a runtime for executing actions generated by LLMs, such as code and API calls, across agents, workflows, and LLM-powered microservices. A key innovation in GoEx is the incorporation of "undo" and "damage confinement" abstractions to mitigate unintended actions and risks. The Gorilla project kick-started tool calling in LLMs, and with millions of user requests, widespread enterprise adoption (including all leading LLM labs), and a thriving open-source community, it continues to shape the evolving field of tool calling for agentic LLMs.

Bio: Shishir G. Patil received his PhD from UC Berkeley, where he was advised by Joseph Gonzalez, Prabal Dutta, and Ion Stoica. He is interested in designing and building efficient machine-learning systems. Recently, his focus has been on teaching LLMs to use tools through API calls. His works include Gorilla LLM, RAFT, OpenFunctions, the Berkeley Function Calling Leaderboard, Skyplane, and POET. He was a Research Fellow at Microsoft Research before starting his PhD.