Zeyu Shen will present his General Exam "Towards Reliable and Aligned AI Systems" on Tuesday, May 12, 2026 at 2:00 PM in CS 301.

Committee Members: Aleksandra Korolova (advisor), Peter Henderson, Prateek Mittal

Abstract:

In this qualification exam, I examine how to build AI systems that remain reliable, safe, and aligned under adversarial pressure. I connect foundational work on adversarial robustness with recent evidence that fine-tuning and reward optimization can induce misalignment in language models, and I study retrieval-augmented generation as a systems setting where safety depends not only on the model itself but also on the trustworthiness of retrieved information. My main presentation will focus on ReliabilityRAG, a reliability-aware, graph-theoretic defense that uses contradiction detection, maximum independent set selection, and weighted sampling to provide both provable robustness and strong empirical protection against retrieval corruption and prompt injection.

Reading List:

https://docs.google.com/document/d/1cfk_AQjsqQruH3n_AO_-bVOg92UnISnlowJd0jZRvvk/edit?usp=sharing

Everyone is invited to attend the talk, and those faculty wishing to remain for the oral exam following are welcome to do so.