Presented by
AIFoundry.org
In-person and virtual community events of AIFoundry.org
Hosted By
42 Going

Multi-LLM Auditioning - AI Hack Lab (virtual)

Virtual
Past Event
About Event

Update:

To investigate how to evaluate multiple LLMs for an application, we'll be exploring two threads:


1. Demo & Hands-on with Llamagator - we'll show the latest capabilities of the Llamagator project for prompting multiple LLMs, including a quantized version of Meta's Llama 3.1.

2. Open LLM evaluation resources page - we'll explore what resources ML engineers and developers need in order to select open models for their applications. We'll then evaluate Llama 3.1 on its degree of openness and document the results for everyone.

===========================

General information about the event:

How do you select which Large Language Model (LLM) to use for your generative AI application? Generalized benchmarks and leaderboards, such as the Hugging Face Open LLM Leaderboard, are a great starting point for creating your shortlist. Eventually, however, you'll want to write specific tests for the range of prompts you'll be using in your application.
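An application-specific prompt test can be as simple as checking that a model's response contains the terms your application requires. The sketch below illustrates the idea; `generate` is a hypothetical stand-in for whatever model client your stack exposes, stubbed here with canned responses so the example runs on its own.

```python
# Sketch of an application-specific prompt test.
# `generate` is a hypothetical placeholder; a real implementation
# would call your model's API instead of returning canned text.

def generate(model: str, prompt: str) -> str:
    canned = {
        "model-a": "Paris is the capital of France.",
        "model-b": "The capital city of France is Paris.",
    }
    return canned[model]  # stubbed response for this sketch

def passes(response: str, required: list[str]) -> bool:
    """A prompt-level test: the response must mention every required term."""
    text = response.lower()
    return all(term.lower() in text for term in required)

prompt = "What is the capital of France?"
for model in ("model-a", "model-b"):
    ok = passes(generate(model, prompt), required=["Paris"])
    print(f"{model}: {'pass' if ok else 'fail'}")
```

Running a suite of such checks over every prompt your application uses gives you a model-by-model scorecard that generalized benchmarks can't.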

Most developers choose a single model and engineer their prompts to work best with it. Often that means defaulting to OpenAI's latest model or, occasionally, whatever is the hottest model of the day. More advanced teams will fine-tune a selected model on their own data, but they are still limited to the model versions they engineered their prompts against.

What if there were an easier way to evaluate several models, and several versions of a model? That's what a team at AIFoundry set out to do. In this AI Hack Lab, the team will demo the results of their investigation and invite you to hack on their code. We'll play with sending prompts to several models, such as OLMo and Llama 3, and compare the results. We'll break into virtual teams, solve some problems, and decide on possible extensions. This is a unique learning opportunity.
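The core of this "auditioning" idea is fanning one prompt out to several models and collecting the responses side by side. A minimal sketch of that pattern is below; `query_olmo` and `query_llama` are hypothetical stand-ins for real model clients, stubbed here so the sketch runs without any model installed.

```python
# Hedged sketch: send one prompt to several models concurrently
# and gather the responses for comparison. The query functions are
# hypothetical stubs standing in for real model clients.
from concurrent.futures import ThreadPoolExecutor

def query_olmo(prompt: str) -> str:
    return "OLMo says: 42"  # stub

def query_llama(prompt: str) -> str:
    return "Llama 3 says: forty-two"  # stub

MODELS = {"OLMo": query_olmo, "Llama 3": query_llama}

def audition(prompt: str) -> dict[str, str]:
    """Send the same prompt to every registered model in parallel."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in MODELS.items()}
        return {name: f.result() for name, f in futures.items()}

for name, text in audition("What is the answer to everything?").items():
    print(f"{name}: {text}")
```

Because each model sits behind the same callable interface, swapping in a new model or version is just another entry in the registry, which is what makes side-by-side evaluation cheap.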

Your critiques and observations are welcome. Help us decide whether this open investigation should become an open-source project.
