AI Evaluations - Research Sprint
This event is hosted by Arcadia Impact in support of the UK AI Safety Institute's "Bounty programme for novel evaluations and agent scaffolding".
Join the London AI community for this rare opportunity to design evaluation tasks that will be used in pre-deployment testing of the world's most advanced AI models.
What's on offer
🤝 Connections in the field of AI safety
🚀 Evals workshops from Michael Schmatz of AISI & Marius Hobbhahn of Apollo Research
💰 Total prize pool of £3,400
What you need to know
The aim of the event is to produce an evaluation design and submit it to the AI Safety Institute (AISI) Bounty applications portal.
Designs will be assessed by a panel of judges, and winners will be announced on the night. Note: this judging determines how prizes and feedback are awarded at the event; it is separate from the internal decision that AISI assessors will make as part of the bounty programme.
Attendance details
Attendance is capped at 60 places
Open to non-technical participants
Attendance is highly encouraged for individuals with expertise in:
AI agent scaffolding
LLM training
National security / critical infrastructure
Chip design and manufacturing
Digital forensics
Security engineering
Fintech (B2C)
Participants are encouraged to form teams of 2-3 during the event, though independent contributions are also welcome.
Schedule
9:30 - 10:30: Arrival and Welcome
10:30 - 12:30: Workshops:
Michael Schmatz of AISI on AISI's Evaluations Strategy
Marius Hobbhahn, CEO / Co-Founder of Apollo Research, on Best Practices in Evaluation Design
12:30 - 13:30: Lunch (provided)
13:30 - 18:00: Research Sprint
18:00 - 19:30: Dinner (provided) & Judging
19:30 - 20:30: Winners Announcement & Expert Feedback
Focus Areas
Teams will work on designing evaluations for frontier AI systems across key risk domains, including:
Development and Assurance Compromise
Autonomous Replication & Adaptation of AI Systems
Dangerous Resource Acquisition
Further details are available in the official AISI documentation: "Request for Evaluations and Agent Scaffolding: Extreme Risks from Frontier AI"
Preparation
Review AISI's guidance on the programme website and in the documentation
Consider potential evaluation ideas in advance
No need to form teams beforehand
Looking for additional ways to prepare?
Requirements
Foundational understanding of frontier AI systems (review this DSIT paper if you are unsure)
Interest in AI safety and governance
Commitment to the full day of activities
Note: by registering for this event, you agree to the Arcadia Impact Privacy Policy.