Fine-Tuning Mistral 7B with Mistral-finetune
With the advent of tools from Hugging Face like the Parameter Efficient Fine-Tuning (PEFT) library, it is easy to customize LLMs for downstream tasks. PEFT methods fine-tune fewer parameters while freezing most of the neural network weights that define an off-the-shelf LLM.
The first method released for fine-tuning via PEFT, and still the most popular method in industry today, is Low-Rank Adaptation of Large Language Models (LoRA), which combines clever matrix algebra with a singular focus on training only the Transformer attention blocks. Effectively, only 1-2% of the total parameters, in the form of low-rank matrix perturbations added to the frozen weights, are trained.
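To make the idea concrete, here is a minimal NumPy sketch of the LoRA update: a frozen weight matrix W is augmented with a trainable low-rank product B @ A, scaled by alpha / r. The dimensions, rank, and scaling factor below are illustrative assumptions, not values from Mistral-finetune or PEFT.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: d is the model dimension, r is the LoRA rank (r << d).
d, r = 512, 4

W = rng.normal(size=(d, d))          # frozen pretrained weight (not trained)
A = rng.normal(size=(r, d)) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))                 # trainable; zero-init so the update starts at 0

def forward(x, alpha=16):
    # LoRA forward pass: frozen path plus the scaled low-rank perturbation B @ A.
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(1, d))
# With B zero-initialized, the adapted model reproduces the frozen model exactly.
assert np.allclose(forward(x), x @ W.T)

# Only A and B are updated during fine-tuning.
trainable = A.size + B.size
total = W.size + trainable
print(f"trainable params: {trainable} / {total} = {trainable / total:.1%}")
```

At these toy sizes the trainable fraction already lands in the 1-2% range quoted above; on a real 7B-parameter model, where LoRA adapters are attached to a handful of large projection matrices, the fraction is similarly small.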
Recently, Mistral released their official repo and guide on fine-tuning Mistral models with LoRA.
In this event, we check it out!
Previously, we’ve covered fine-tuning Mistral models and using LoRA.
Given this context, we’ll look at Mistral’s new library and compare it to other tools and techniques for fine-tuning LLMs. We will also discuss some of the differences to keep in mind between smaller models built on a classic decoder-only Transformer architecture (e.g., Mistral 7B) and beefier Mixture-of-Experts models (e.g., Mixtral 8x7B).
We’ll take the latest Mistral 7B Instruct v3 model off the shelf to assess the library and find out: does it do what it says on the tin?
Note (from the Mistral-finetune README):
“The goal of this repository is to provide a simple, guided entry point to finetune Mistral models. As such, it is fairly opinionated (especially around data formatting) and does not aim at being exhaustive across multiple model architecture or hardware types. For more generic approaches, you can check out some other great projects like torchtune.”
Join us live to learn:
How to fine-tune Mistral AI models with Mistral-finetune.
How Mistral-finetune compares to Hugging Face’s PEFT-QLoRA methods.
How to build Mistral-finetune into your LLM application workflow.
Speakers:
“Dr. Greg” Loughnane is the Co-Founder & CEO of AI Makerspace, where he is an instructor for their AI Engineering Bootcamp. Since 2021 he has built and led industry-leading Machine Learning education programs. Previously, he worked as an AI product manager, a university professor teaching AI, an AI consultant and startup advisor, and an ML researcher. He loves trail running and is based in Dayton, Ohio.
Chris “The Wiz” Alexiuk is the Co-Founder & CTO at AI Makerspace, where he is an instructor for their AI Engineering Bootcamp. During the day, he is also a Developer Advocate at NVIDIA. Previously, he was a Founding Machine Learning Engineer, Data Scientist, and ML curriculum developer and instructor. He’s a YouTube content creator whose motto is “Build, build, build!” He loves Dungeons & Dragons and is based in Toronto, Canada.
Follow AI Makerspace on LinkedIn and YouTube to stay updated about workshops, new courses, and corporate training opportunities.