CS Colloquium speakers

Speaker: Haoshu Fang, Massachusetts Institute of Technology 
Date: Tuesday, April 1
Time: 12:30pm EDT
Location: CS 105
Host: Ben Eysenbach
Event page: https://www.cs.princeton.edu/events/science-data-human-level-robotic-manipulation
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_i8siw7H2ROmf0g3f9JlGTg

Title: Science of Data for Human-Level Robotic Manipulation

Abstract: Machine learning has revolutionized many subfields of robotics, from visual perception to task planning. However, the fundamental challenge of low-level motor control for object manipulation with raw sensory observations remains unresolved, primarily due to the lack of robot state and action data during manipulation. This issue is particularly pronounced in tasks requiring multi-finger coordination and fine tactile sensing.

Addressing the data problem is essential, as many modalities of robotic data, such as tactile and proprioceptive information, are not readily available online. The key scientific questions in this domain are: (i) how to collect data, (ii) what data to collect, and (iii) how to learn effectively from such data.

In this talk, I will (i) introduce a novel paradigm for data collection through the design of innovative interaction interfaces, (ii) demonstrate how identifying key dimensions for scaling data can enable human-level robotic grasping, and (iii) present methods and insights on efficiently learning from heterogeneous robotic data.

Bio: Haoshu Fang is a Postdoctoral Researcher at MIT CSAIL, working with Pulkit Agrawal and Edward Adelson. He earned his PhD from Shanghai Jiao Tong University. Haoshu's research focuses on general robotic manipulation, addressing the data challenge by designing data-centric hardware, leveraging data scaling laws, and developing data-efficient learning methods. His work has received three best-paper awards or nominations at top robotics conferences, as well as fellowships from Microsoft, Baidu, and ByteDance.


Speaker: Shishir Patil, University of California, Berkeley
Date: Thursday, April 3
Time: 12:30pm EDT
Location: CS 105
Host: Ravi Netravali
Event page: https://www.cs.princeton.edu/events/agenticsystems-teaching-llms-use-tools-scale
Register for live-stream online here: https://princeton.zoom.us/webinar/register/WN_RXiSP_bZThmG6agRH4trNA

Title: AgenticSystems: Teaching LLMs to Use Tools at Scale

Abstract: In this talk, I will present our vision of AgenticSystems and introduce our approach to integrating Large Language Models (LLMs) with various tools via APIs. Connecting LLMs with APIs presents a significant challenge, as these models often struggle to generate precise input arguments and are prone to hallucinating API calls. To address this, we developed Gorilla LLM, trained with our novel Retriever-Aware Training (RAT), setting a new benchmark for tool use in LLMs. Gorilla also introduces a programming-language-inspired metric to quantify hallucinations, a common issue in LLMs. I will conclude by presenting GoEx, a runtime for executing actions generated by LLMs, such as code and API calls, across agents, workflows, and LLM-powered microservices. A key innovation in GoEx is the incorporation of "undo" and "damage confinement" abstractions to mitigate unintended actions and risks.

The Gorilla project kick-started tool-calling in LLMs; with millions of user requests, widespread enterprise adoption (including all leading LLM labs), and a thriving open-source community, it continues to shape the evolving field of tool-calling for agentic LLMs.

Bio: Shishir G. Patil received his PhD from UC Berkeley, where he was advised by Joseph Gonzalez, Prabal Dutta, and Ion Stoica. He is interested in designing and building efficient machine-learning systems. Recently, his focus has been on teaching LLMs to use tools through API calls. His works include Gorilla LLM, RAFT, OpenFunctions, the Berkeley Function Calling Leaderboard, Skyplane, and POET. Before starting his PhD, he was a Research Fellow at Microsoft Research.