Princeton AI Alignment and Safety Seminar (PASS): Tuesday, March 19 at 2pm
Princeton AI Alignment and Safety Seminar (PASS)

Catastrophic misalignment of large language models
[ https://paulfchristiano.com/ | Paul Christiano, Alignment Research Center ]

Tuesday, March 19, 2:00 - 3:00 pm
[ https://bit.ly/3IHlz4h | View the livestream ]

Abstract: I’ll discuss two possible paths by which AI systems could be so misaligned that they attempt to deceive and disempower their human operators. I’ll review the current state of evidence about these risks, what we might hope to learn over the next few years, and how we could become confident that the risk is adequately managed.

Bio: I run the [ http://alignmentresearchcenter.org/ | Alignment Research Center ]. I previously ran the language model alignment team at [ https://openai.com/ | OpenAI ], and before that received my PhD from the [ http://theory.cs.berkeley.edu/ | theory group ] at UC Berkeley. You may be interested in my [ https://ai-alignment.com/ | writing about alignment ], my [ http://sideways-view.com/ | blog ], my [ https://scholar.google.com/citations?hl=en&user=B7oP0bIAAAAJ&view_op=list_works | academic publications ], or [ https://paulfchristiano.com/fun-and-games/ | fun and games ]. I am an advisor and board member at [ https://metr.org/ | METR ], an external advisor to the [ https://www.gov.uk/government/publications/ai-safety-institute-overview/intr... | UK AI Safety Institute ], and a trustee of [ https://www.anthropic.com/index/anthropics-responsible-scaling-policy | Anthropic’s Long-Term Benefit Trust ].

Stay informed and receive seminar reminders by joining our mailing list: [ https://tinyurl.com/pass-mailing | https://tinyurl.com/pass-mailing ]

Organized by: [ https://pli.princeton.edu/ | Princeton Language and Intelligence ]