Evaluating LLMs: Needle in a Haystack
LLM evaluation is a discipline where confusion reigns and foundation model builders are effectively grading their own homework.
Building on viral threads on X/Twitter, Greg Kamradt, Robert Nishihara, and Jason Lopatecki discuss highlights from Arize AI's ongoing research into how major foundation models – from OpenAI's GPT-4 to Mistral and Anthropic's Claude – stack up against each other on important tasks and emerging LLM use cases. They cover results from Needle in a Haystack tests and other evals, including hallucination detection on private data, question-and-answer, code functionality, and more, and explain why those results matter.
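For anyone new to the format: a Needle in a Haystack test inserts a short "needle" fact at varying depths inside a long "haystack" context and asks the model to retrieve it, mapping where retrieval breaks down as context length and needle position change. Below is a minimal sketch of that loop; the needle, filler text, model name, and pass/fail scoring are illustrative assumptions, not the exact harness used by Greg Kamradt or Arize.

```python
# Minimal Needle-in-a-Haystack sketch (illustrative only).
# Assumes an OpenAI-compatible client; the needle, filler, model name,
# and scoring below are simplified placeholders, not the original benchmark.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

NEEDLE = "The secret passphrase for the meetup is 'blue-orchid-42'."
FILLER = "Large language models are evaluated on many downstream tasks. " * 2000
QUESTION = "What is the secret passphrase for the meetup?"


def build_haystack(depth_fraction: float) -> str:
    """Insert the needle at a given relative depth of the long context."""
    cut = int(len(FILLER) * depth_fraction)
    return FILLER[:cut] + "\n" + NEEDLE + "\n" + FILLER[cut:]


def run_trial(depth_fraction: float, model: str = "gpt-4-turbo") -> bool:
    """Ask the model to retrieve the needle; return True if it succeeds."""
    context = build_haystack(depth_fraction)
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "Answer using only the provided document."},
            {"role": "user", "content": f"{context}\n\nQuestion: {QUESTION}"},
        ],
    )
    answer = response.choices[0].message.content or ""
    # Naive scoring: did the model recover the passphrase verbatim?
    return "blue-orchid-42" in answer


if __name__ == "__main__":
    for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"depth={depth:.2f} retrieved={run_trial(depth)}")
```

Sweeping both needle depth and haystack length produces the familiar heatmaps from the original threads, showing where a given model starts losing facts buried deep in long contexts.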
Curious which foundation models your company should be using for a specific use case – and which to avoid? You won’t want to miss this meetup!
-------
Agenda:
5:30 PM - 6:00 PM: Arrival & Networking
6:00 PM - 6:30 PM: Fine-tuning for Context Length Extension + Q&A w/ Kourosh Hakhamaneshi
6:30 PM - 7:15 PM: Evaluating LLMs: Needle in a Haystack Fireside Chat + Q&A
7:15 PM - 8:00 PM: Networking & Drinks