LLM Evaluation Essentials

 
 
Zoom
About Event

Step into the world of LLM evaluations with a 3-part series dedicated to achieving production excellence. We’ll unpack advanced evaluation techniques and best practices formulated through rigorous testing — spanning retrieval, summarization, and hallucination — to help ensure production readiness. A must-attend for AI & ML engineers and data scientists.

This series will cover:

  • Binary LLM performance evaluation and its benefits

  • Golden datasets and how to use them

  • Statistical analysis of the performance of GPT-4, GPT-3.5, and more

  • Best practices for LLM evals

Session 1 (10/3): Benchmarking and Analyzing Retrieval Approaches

Session 2 (10/10): Statistical Analysis of Summarization LLM Evaluations

Session 3 (10/16): Statistical Analysis of Hallucination LLM Evaluations