CAIA Speaker Event: Evan Hubinger (Anthropic)

Name: CAIA Speaker Event: Evan Hubinger (Anthropic)
Start: 2025-05-27T14:00:00.000-07:00
End: 2025-05-27T15:00:00.000-07:00
Location: Mabel and Arnold Beckman Laboratories of Behavioral Biology

Hosted by Adarsh Muthiah Kumarappan

About Event

Here are some details on Caltech AI Alignment’s next speaker event:

Who: Evan Hubinger (virtual), Anthropic
When: May 27th at 2-3 pm PT
Where: Watch party in BBB B180
Zoom link: https://caltech.zoom.us/j/85478635427
What: Evan Hubinger leads Alignment Stress-Testing, one of the alignment research teams at Anthropic. Evan will present Alignment Faking in Large Language Models, his team's research on how Claude will engage in alignment faking: selectively complying with its training objective in training to prevent modification of its behavior out of training. Beyond that paper, Evan's work at Anthropic includes Auditing Language Models for Hidden Objectives and Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training.

No specific technical background is required; we welcome all interested students who are eager to learn! As with all CAIA events, we will have pizza and boba!

Location

Mabel and Arnold Beckman Laboratories of Behavioral Biology

Pasadena, CA 91125, USA

Artificial Intelligence