[alg-ml-reading-group] Alg-ML Talk by Akshay Krishnamurthy on May 12, Monday, from 3pm - 4pm ET

Dr. Akshay Krishnamurthy from Microsoft Research will present his recent work, "Understanding Inference Time Compute: Self-Improvement and Scaling." The talk is scheduled for Monday, May 12.

Bio: Akshay Krishnamurthy is a senior principal research manager at Microsoft Research, New York City. He previously spent two years as an assistant professor in the College of Information and Computer Sciences at the University of Massachusetts, Amherst, and a year as a postdoctoral researcher at Microsoft Research, NYC. He completed his PhD in the Computer Science Department at Carnegie Mellon University, advised by Aarti Singh, and received his undergraduate degree in EECS at UC Berkeley. His research interests are broadly in the areas of machine learning and statistics. He is most excited about interactive learning and decision making, including reinforcement learning, and has recently been thinking about how reinforcement learning can be used to improve modern generative AI systems.

Feel free to grab a slot [ https://docs.google.com/spreadsheets/d/10LQsvqhbxw36MDLcxpnDPcG12nsrDl33yG0F... | here ] to meet with our speaker! The talk will start at 3:00 PM ET.

____________________________________________________________________________

Time: 3:00 PM ET on Monday, May 12
Physical Location: COS 402

____________________________________________________________________________

Title: Understanding Inference Time Compute: Self-Improvement and Scaling

Abstract: Inference-time compute has emerged as a new axis for scaling large language models, leading to breakthroughs in AI reasoning. Broadly speaking, inference-time compute methods allow the language model to interact with a verifier to search for desirable, high-quality, or correct responses. While recent breakthroughs involve using a ground-truth verifier of correctness, it is also possible to invoke the language model itself, or an otherwise learned model, as a verifier.
These latter protocols raise the possibility of self-improvement, whereby the AI system evaluates and refines its own generations to achieve higher performance. This talk presents new understanding of, and new algorithms for, language model self-improvement. The first part of the talk focuses on a new perspective on self-improvement that we refer to as sharpening, whereby we "sharpen" the model toward one that places large probability mass on high-quality sequences, as measured by the language model itself. We show how the sharpening process can be performed purely at inference time or amortized into the model via post-training, thereby avoiding expensive inference-time computation. In the second part of the talk, we consider the more general setting of a learned reward model, show that the performance of naive but widely used inference-time compute strategies does not improve monotonically with compute, and develop a new compute-monotone algorithm with optimal statistical performance.

Based on joint works with Audrey Huang, Dhruv Rohatgi, Adam Block, Qinghua Liu, Jordan T. Ash, Cyril Zhang, Max Simchowitz, Dylan J. Foster, and Nan Jiang.

We hope to see you there!

____________________________________________________________________________

(Useful links: [ https://princeton-alg-ml.github.io/ | alg-ml-website ] , [ https://calendar.google.com/calendar/u/1?cid=Y185ZWQxMzVmOGMxN2JjZmNhYjAyOTk... | alg-ml-calendar ] , [ https://lists.cs.princeton.edu/mailman/listinfo/alg-ml-reading-group | alg-ml-mailing-list ] )

Best,
Zixuan, Gon, and Catherine

_______________________________________________
alg-ml-reading-group mailing list
[ mailto:alg-ml-reading-group@lists.cs.princeton.edu | alg-ml-reading-group@lists.cs.princeton.edu ]
[ https://lists.cs.princeton.edu/mailman/listinfo/alg-ml-reading-group | https://lists.cs.princeton.edu/mailman/listinfo/alg-ml-reading-group ]
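P.S. For anyone attending who wants a concrete mental model before the talk: the simplest inference-time instance of the "self-evaluation" idea in the abstract is best-of-N sampling, where the model scores its own candidate generations and keeps the one it rates highest. The toy sketch below is purely illustrative (a hand-coded token distribution stands in for the language model; the function names are our own, not the speaker's algorithm):

```python
import math
import random

random.seed(0)

# Toy "language model": a fixed next-token distribution over a tiny vocabulary.
# In a real system, sampling and scoring would both come from the same LLM.
VOCAB = ["a", "b", "c"]
PROBS = [0.6, 0.3, 0.1]

def sample_sequence(length=4):
    """Sample one candidate response from the toy model."""
    return tuple(random.choices(VOCAB, weights=PROBS, k=length))

def self_log_prob(seq):
    """The model scores its own output: sum of per-token log-probabilities."""
    return sum(math.log(PROBS[VOCAB.index(tok)]) for tok in seq)

def best_of_n(n=64, length=4):
    """Draw n candidates and keep the one the model itself rates highest,
    concentrating ("sharpening") mass on sequences the model deems likely."""
    candidates = [sample_sequence(length) for _ in range(n)]
    return max(candidates, key=self_log_prob)

best = best_of_n(n=64)
print(best, self_log_prob(best))
```

As N grows, the selected output concentrates on the model's own high-probability sequences; one theme of the talk is when such naive strategies do, and do not, keep improving with more compute.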