Simran Kaur will present her General Exam, "Skill-Mix: A Flexible and Expandable Family of Evaluations for AI Models," on Wednesday, May 15, 2024 at 2:00 PM in CS 105 and via Zoom.
Zoom link: https://princeton.zoom.us/my/skaur
Committee Members: Sanjeev Arora (advisor), Elad Hazan, Danqi Chen
Abstract:
Existing LLM evaluations inadequately assess the originality of a model’s generated text and are susceptible to training-set contamination. Motivated by recent work (Arora & Goyal, 2023) that gives a mathematical model for skill emergence via LLM scaling, we present (a) Skill-Mix, a flexible and expandable family of LLM evaluations that tests for a form of compositional generalization; (b) probability calculations to evaluate whether LLMs surpass "stochastic parrot" behavior, i.e., whether an LLM can produce novel pieces of text that were not encountered in the training corpus; and (c) an investigation into synthetic data generation via Skill-Mix, which can be a more efficient alternative to raw human data for enhancing model capabilities.
Reading List:
https://docs.google.com/document/d/1N9ic4kFkVC07Fkrz_Vh5e774swk5Kq9FaEFMAJ6OUqc/edit
Everyone is invited to attend the talk, and faculty who wish to remain for the oral exam that follows are welcome to do so.