12 Went
Private Event

Animal Benchmark Building Session

Hosted by AI for Animals
Register to See Address
Berkeley, California
Registration
Past Event
About Event

Just before the AI for Animals conference, join us for a 4-hour coworking session to create a new, light benchmark!

Description: The Animal Harm Assessment (AHA) project gave us data: over 100,000 answers from 10 chatbots to 4,350 curated questions. Each answer has been scored on whether it increases or decreases the risk of harm to animals. Some of the question-answer (QA) pairs could plausibly form the "gold standard" for a light QA benchmark (as opposed to an open-ended one). Key unresolved questions to address:

  1. Which questions and which answers to choose?

  2. How to set the benchmark up technically?

  3. How to increase the use of this (and other) animal-related benchmarks?

Come if you are interested. It is especially helpful if you have some familiarity with:

  • benchmarks, benchmark development

  • statistical methods

  • Python, ideally including the Inspect Evals framework

  • people, labs, and institutions that could use and promote this and other animal benchmarks

RSVP on this event page and share it with others who might be interested!

If you have questions, please reach out to:
Arturs Kanepajs, AI for Animals Benchmarking Lead: akanepajs@gmail.com
Constance Li, AI for Animals Founder: constance@aiforanimals.org

There will be snacks and light refreshments.


Expected agenda:
12:00-12:30 - Introductions and overview of goals
12:30-14:00 - Working session on benchmark development
14:00-14:15 - Break for refreshments
14:15-15:45 - Continue working session
15:45-16:00 - Wrap-up and next steps

Some more materials - if you can, take a look before the session:

To get updates about the outcomes and next steps, join the #s-llm-benchmarking channel on Hive Slack (www.joinhive.org).
