29 Going

Current challenges in the alignment techniques of foundation models

Hosted by Ines Altemir Marinas & Agatha Duzan
Past Event
About Event

Are current AI safety measures enough? Join us to examine their effectiveness against future threats!


Program:

  • 19h00: Talk by CeSIA

  • 19h30: Q&A

Location: EPFL, room CM 1 221

The talk will be given remotely and will be followed by a live Q&A and an apéro, with plenty of opportunities to ask the speaker questions!


What to expect:

  • An introduction to a framework for examining the progression of AI development, focusing on the growth in agency and generality among AI models. This trend implies that future iterations may exhibit novel types of malfunctions not seen in present-day models.

  • An overview of existing technical safety measures for AI: how they mitigate current failure modes, and their potential to address future issues.

  • A presentation of safety as a characteristic of the socio-technical system in which technical development takes place, covering defense-in-depth strategies, organizational safety culture, and the role of third-party auditors.

  • An introduction to BELLS, a practical assessment tool for evaluating the resilience of large language model supervision systems. This is CeSIA's main technical project, which was presented at the ICML conference.

Speaker: 

Charbel-Raphaël Ségerie is the executive director of the Centre pour la Sécurité de l'IA (CeSIA), where he leads research and training in advanced AI. He teaches a course on AI safety at the École Normale Supérieure. His work focuses on comprehensively characterising emerging risks in AI, on interpretability, on addressing challenges with current safety methods like RLHF, and on safe-by-design AI approaches. Previously, he was a researcher at Inria Parietal and NeuroSpin.


Organized by:

Safe AI Lausanne, an EPFL student association and commission of EA Lausanne.

Check out our website or join our Telegram if you're interested in AI safety!

Location
EPFL
1015 Lausanne, Switzerland