Lucy He will present her General Exam, "The Hidden Risks of Benign Data: Generative AI’s Safety and Copyright Challenges," on Tuesday, April 15, 2025 at 10:30 AM in CS 302 and via Zoom.

14 Apr 2025

Zoom link: https://princeton.zoom.us/my/lucyhe 

Committee Members: Danqi Chen (co-advisor), Peter Henderson (co-advisor), Tom Griffiths 

Abstract: 
Large language models and text-to-image models can be vulnerable to seemingly harmless inputs, resulting in safety degradation or copyright risks. In the first part of the talk, I will discuss critical issues related to preserving the safety of open-weight models. I will describe a general approach to identifying toxicity-promoting data. Using this approach, we find that superficially benign data, such as a collection of bullet lists and math, can be even more harmful than explicitly toxic text. In the second part of the talk, I will introduce our work examining copyright concerns in text-to-image models. We introduce a framework that evaluates both the model’s copyright compliance and its consistency with user input. We then apply the framework to study how benign prompts trigger copyrighted character generation and the potential (in)effectiveness of current mitigation strategies. These works critically examine how benign data can compromise model safety and trigger copyright risks, shedding light on how to better understand and improve these systems for deployment.

Reading List: 
https://docs.google.com/document/d/1JHuGtMwAplGjamaEB_v8RuBHUUrZRnxGeZwceUTD... 

Everyone is invited to attend the talk, and those faculty wishing to remain for the oral exam following are welcome to do so.

CS Grad Department
