


Zurich
Escaping Alignment: LLM Jailbreaks and their defenses
Past Event
About Event
Samuel Šimko will share his recent work on jailbreaking LLMs and explain why it matters for AI safety research.
He will walk through his contest-winning manual jailbreaking techniques, survey the current landscape of automated attacks and defenses, and conclude with a brief presentation of his master's thesis, which proposes a novel defense against adversarial attacks on LLMs.
Expect an accessible session aimed at students and practitioners curious about large language model security.