Presented by
Boundary
We make BAML, a programming language for using LLMs. Some event recordings are available here: https://github.com/hellovai/ai-that-works
Hosted By

🦄 ai that works: Eval-ing multiple models for each prompt

Zoom
About Event

🦄 ai that works

A weekly conversation about how we can all get the most juice out of today's models with @hellovai & @dexhorthy

​https://www.github.com/hellovai/ai-that-works


AI That Works #16 will be a super-practical deep dive into real-world examples and techniques for evaluating a single prompt against multiple models. This is a commonly heralded use case for evals (e.g. "how do we know if the new model is better?" or "how do we know if the new model breaks anything?"), but there aren't many practical examples out there for real-world use cases.


​On this episode we'll do a ton of hands-on live coding to look at different ways to slice and dice your prompt library to test and evolve it while understanding performance with different models.

​Pre-reading

To avoid repeating the basics, we recommend you come in already familiar with some of the tooling we will be using:

  • ​Discord

  • Cursor (a VS Code replacement)

  • ​Programming languages

    • Application Logic: Python, TypeScript, or Go

    • Prompting: BAML (recommended video)

Meet the Speakers 🧑‍💻

Meet Vaibhav Gupta, one of the creators of BAML and a YC alum. He spent 10 years in AI performance optimization at places like Google, Microsoft, and D. E. Shaw. He loves diving deep and chatting about anything related to Gen AI and Computer Vision!

Meet Dex Horthy, founder of HumanLayer, a YC company. He spent 10+ years building DevOps tools at Replicated, Sprout Social, and JPL. DevOps junkie turned AI Engineer.
