Llama-2 or OpenAI? How to compare LLMs using A/B testing

Hosted by Daliana Liu & Alexa Guerra
 
 
Zoom
About Event

Should we use an open-source model like Llama-2 or the OpenAI APIs? The best way to decide which one works better in production is through A/B testing. In this workshop, we'll build a use case with Llama-2 and the OpenAI APIs, then show you how to set up an A/B test for LLMs and analyze the results.

We'll cover:

  1. Experimentation for AI: Discover why continuous experimentation is a non-negotiable aspect of AI, and how it impacts your model choices.

  2. Demystify Llama-2: Uncover how this LLM operates and what makes it distinctive.

  3. A/B Testing for LLMs Setup: Learn how to use Eppo's feature flagging to run experiments on an LLM.

  4. Case Study - Text Extraction: See Llama-2 in action, extracting information from the unstructured text of a resume, as a practical application.

  5. Analyze A/B Testing Results: Learn which metrics matter most when comparing LLMs, so you can make decisions with confidence.
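
To give a feel for the setup and analysis steps in the outline above, here is a minimal Python sketch. It is not the workshop's code and does not use Eppo's SDK. The variant names, the experiment key, and the helper functions `assign_variant` and `two_proportion_z_test` are all hypothetical, and it assumes the outcome is a simple success/failure per request (for example, whether an extraction was judged correct). A real platform would handle assignment and statistics for you.

```python
import hashlib
import math

# Hypothetical variant names for the two models being compared.
VARIANTS = ("llama-2", "openai")


def assign_variant(user_id: str, experiment: str = "llm-comparison") -> str:
    """Deterministically bucket a user so they always get the same model."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return VARIANTS[int(digest, 16) % len(VARIANTS)]


def two_proportion_z_test(success_a: int, n_a: int, success_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in success rates B - A."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    print(assign_variant("user-123"))
    # Placeholder counts for illustration only, not real results.
    z, p = two_proportion_z_test(success_a=40, n_a=100, success_b=60, n_b=100)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

Hashing the user ID together with the experiment key keeps each user on one model across requests, which is what makes the two groups comparable. In practice you would likely also compare latency and cost per request, not only a single success rate.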

Speakers

Daliana Liu is a senior data scientist at Predibase. Previously, she worked on A/B testing for 3 years at Amazon and developed machine learning models for AWS customers. Daliana has 200k followers on LinkedIn, where she talks about A/B testing and machine learning.

Sven Schmit is the head of statistics engineering at Eppo, the next-gen A/B testing platform. Sven has a PhD in computational and mathematical engineering from Stanford.

This is a great opportunity to learn about both LLMs and A/B testing, and it is free to sign up.