Trust, Safety and AI: Fireside Chat with Anthropic Researchers & Intrinsic
Join us for an evening of engaging conversations around Trust, Safety, and AI on July 23, 2024, from 5:30 PM to 8:00 PM at the beautiful outdoor patio of the 645 Ventures offices.
Fireside Chat
Join Anthropic AI safety researchers Trenton Bricken and Cem Anil for a fireside chat, where we will dive into the inner workings of large language models and the potential impact of these technologies on Trust & Safety.
Trenton Bricken is part of the Mechanistic Interpretability team, which seeks to understand and interpret the “features” that large language models learn. The most recent public display of this mechanistic understanding was “Golden Gate Claude,” in which Claude was made to believe it was the Golden Gate Bridge rather than an AI assistant.
Cem Anil is on the Alignment Science team. His work includes “influence functions,” which reveal what training data led a model to produce a given output. He also pioneered Many-Shot Jailbreaking, an effective approach to evading the safety guardrails of LLMs by leveraging their long context windows.
This event will be moderated by Karine Mellata from Intrinsic and will be conducted under the Chatham House Rule to encourage open and honest discussion.
Follow Trenton Bricken and Cem Anil on Twitter!
Event
The fireside chat will begin at 6:15 PM PT and run for 30 minutes. There will be time before and after to mingle! Enjoy catered sushi along with a selection of alcoholic and non-alcoholic beverages.
Hosted by Intrinsic and 645 Ventures, this event will take place on the sidelines of the TrustCon conference.