49 Going

AI Evaluations - Research Sprint

Hosted by Justin Olive & 5 others
Register to See Address
London, England
Registration
Past Event
About Event

This is an event by Arcadia Impact in support of the UK AI Safety Institute's "Bounty programme for novel evaluations and agent scaffolding".

Join the London AI community for this rare opportunity to design evaluation tasks that will be used in pre-deployment testing of the world's most advanced AI models.

What's on offer

🤝 Connections in the field of AI safety

🚀 Evals workshops from Michael Schmatz of AISI & Marius Hobbhahn of Apollo Research

💰 Total prize pool of £3,400

What you need to know

During this event, the aim is to produce an evaluation design and submit it to the AI Safety Institute (AISI) Bounty applications portal.

Designs will be assessed by a panel of judges, and winners will be announced on the night. Note: this judging decision is for distributing rewards and feedback during the event; it is separate from the internal decision that AISI assessors will make as part of the bounty programme.

Attendance details

  • Attendance is capped at 60 places

  • Open to non-technical participants

  • Attendance is highly encouraged for individuals with expertise in:

    • AI agent scaffolding

    • LLM training

    • National security / critical infrastructure

    • Chip design and manufacturing

    • Digital forensics

    • Security engineering

    • Fintech (B2C)

  • Participants are encouraged to form teams of 2-3 during the event, though independent contributions are also welcome.

Schedule

  • 9:30 - 10:30: Arrival and Welcome

  • 10:30 - 12:30: Workshops:

    • Michael Schmatz of AISI on AISI's Evaluations Strategy

    • Marius Hobbhahn, CEO / Co-Founder of Apollo Research, on Best Practices in Evaluation Design

  • 12:30 - 1:30: Lunch (provided)

  • 1:30 - 6:00: Research Sprint

  • 6:00 - 7:30: Dinner (provided) & Judging

  • 7:30 - 8:30: Winners Announcement & Expert Feedback

Focus Areas

Teams will work on designing evaluations for frontier AI systems across key risk domains, including:

  • Development and Assurance Compromise

  • Autonomous Replication & Adaptation of AI Systems

  • Dangerous Resource Acquisition

Further details are available in the official AISI documentation: Request for Evaluations and Agent Scaffolding: Extreme Risks from Frontier AI

Preparation

  • Review AISI's guidance on the programme website and in the documentation

  • Consider potential evaluation ideas in advance

  • No need to form teams beforehand

Looking for additional ways to prepare?

Requirements

  • Foundational understanding of frontier AI systems (review this DSIT paper if you are unsure)

  • Interest in AI safety and governance

  • Commitment to the full day of activities

Note: by registering for this event, you agree to the Arcadia Impact Privacy Policy.
