

AI Safety Research Exchange
Join fellow AI safety researchers for a collaborative gathering focused on sharing current work and exploring new opportunities.
At this Research Exchange, participants will deliver brief lightning talks highlighting their ongoing projects, research directions they’re eager to explore, and areas where they’re seeking collaboration.
Whether you’re looking to find research collaborators, get feedback on early-stage ideas, or simply stay informed about developments across the AI safety landscape, this Research Exchange offers a focused environment to build connections that advance this important field.
If you would like to present, please indicate this in the response form and we will reach out to you.
Lightning talks:
Confidence elicitation: A new attack vector for LLMs
Reasoning LLMs are Wandering Solution Explorers
Bullying the Machine: How Personas Increase LLM Vulnerability
Mechanistic Interpretability: Interpreting refusal in LLMs