80 Went

CAIA Speaker Event: Evan Hubinger (Anthropic)

Hosted by Adarsh Muthiah Kumarappan
Past Event
About Event

Here are some details on Caltech AI Alignment’s next speaker event:

  1. Who: Evan Hubinger (virtual), Anthropic

  2. When: May 27th at 2-3 pm PT

  3. Where: Watch party in BBB B180

  4. Zoom link: https://caltech.zoom.us/j/85478635427 

  5. What: Evan Hubinger leads Alignment Stress-Testing, one of the alignment research teams at Anthropic. He will present Alignment Faking in Large Language Models, his team's research on how Claude can engage in alignment faking: selectively complying with its training objective during training to prevent modification of its behavior outside of training. Evan's other work at Anthropic includes Auditing Language Models for Hidden Objectives and Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training.

No specific technical background is required. We welcome all interested students who are eager to learn! As with all CAIA events, we will have pizza and boba!

Location
Mabel and Arnold Beckman Laboratories of Behavioral Biology
Pasadena, CA 91125, USA