Kaiqu Liang will present his General Exam "Towards human-centered AI safety" on Monday, May 13, 2024 at 1:00 PM in CS 105.

Committee Members: Jaime Fernández Fisac (advisor), Tom Griffiths, Benjamin Eysenbach

Abstract:
In the domain of human-AI interaction, ensuring the safety of AI systems is crucial, particularly as they are deployed in high-stakes scenarios. This goes beyond the conventional focus on aligning AI objectives with human values: it requires a deep understanding of, and respect for, human needs as hard constraints within the AI's decision-making process. My research aims to develop safety-critical human-AI systems that proactively prevent violations of critical human needs. To achieve this, my PhD research pursues three dimensions: developing safe representations, enhancing human-AI alignment, and advancing safety-aware cognition.

In my early PhD research, I took a first step toward safety-aware cognition by quantifying safety risks and uncertainty in robot task execution with a large language model (LLM) planner. We introduced introspective planning, a novel method that enables robots to proactively identify and seek clarification on uncertainties, effectively minimizing risk without fine-tuning. Through rigorous testing, we demonstrated that introspective planning outperforms existing LLM-based methods in both success rate and safety, showing significant promise for enhancing human-AI interaction in safety-critical settings. This approach not only addresses the direct issue of ensuring compliance and safety in robot task execution but also contributes to the broader objective of developing AI systems that treat human needs and safety as foundational principles.

Reading List:
https://docs.google.com/document/d/14BHiLF8wsrXU2YwErddQ6RD2LKNNIIQeajkHkkgPJNY/edit

Everyone is invited to attend the talk, and faculty wishing to remain for the oral exam that follows are welcome to do so.