Cover Image for Trust, Safety and AI: Fireside Chat with Anthropic Researchers & Intrinsic
Private Event

Trust, Safety and AI: Fireside Chat with Anthropic Researchers & Intrinsic

Hosted by Karine Mellata
Register to See Address
San Francisco, California
Registration
Past Event
About Event

Join us for an evening of engaging conversations around Trust, Safety, and AI on July 23, 2024, from 5:30 PM to 8:00 PM at the beautiful outdoor patio of the 645 Ventures offices.

Fireside Chat

Join Anthropic AI Safety researchers Trenton Bricken and Cem Anil for a fireside chat, where we will dive into the inner workings of large language models and the potential impact of these technologies on Trust & Safety.

Trenton Bricken is part of the Mechanistic Interpretability team, which seeks to understand and interpret the “features” that large language models learn. The most recent display of this mechanistic understanding was “Golden Gate Claude,” in which a version of Claude was made to believe that it was the Golden Gate Bridge instead of an AI assistant.

Cem Anil is on the Alignment Science team. His work includes “influence functions,” which help reveal which training data caused the AI to do something. He has also pioneered Many-Shot Jailbreaking, an effective approach to evading the safety guardrails of LLMs by leveraging their long context windows.

This event will be moderated by Karine Mellata from Intrinsic and will be conducted under the Chatham House Rule to encourage open and honest discussion.

Follow Trenton Bricken and Cem Anil on Twitter!

Event

The fireside chat will begin at 6:15 PM PT and run for 30 minutes. There will be time before and after to mingle! Enjoy catered sushi along with a selection of alcoholic and non-alcoholic beverages.

Hosted by Intrinsic and 645 Ventures, this event will take place on the sidelines of the TrustCon conference.
