Zeyu Shen will present his General Exam "Towards Reliable and Aligned AI Systems" on Tuesday, May 12, 2026 at 2:00 PM in CS 301.
Committee Members: Aleksandra Korolova (advisor), Peter Henderson, Prateek Mittal

Abstract: In this qualification exam, I examine how to build AI systems that remain reliable, safe, and aligned under adversarial pressure. I connect foundational work on adversarial robustness with recent evidence that fine-tuning and reward optimization can induce misalignment in language models, and I study retrieval-augmented generation as a systems setting where safety depends not only on the model itself but also on the trustworthiness of retrieved information. My main presentation will focus on ReliabilityRAG, a reliability-aware, graph-theoretic defense that uses contradiction detection, maximum independent set selection, and weighted sampling to provide both provable robustness and strong empirical protection against retrieval corruption and prompt injection.

Reading List: https://docs.google.com/document/d/1cfk_AQjsqQruH3n_AO_-bVOg92UnISnlowJd0jZR...

Everyone is invited to attend the talk, and faculty who wish to remain for the oral exam that follows are welcome to do so.