Build Your Own GPT-2
Presented by
Future: UK
About Event

A hands-on course to build GPT-2 from scratch!

This is an all-day workshop - in order to get everything into one day, it's going to be intense.

This workshop is for people who:

  • Are very comfortable writing and debugging Python.

  • Have some familiarity with PyTorch or NumPy (we will be using PyTorch).

  • Ideally have some ML background, though this is not required.

    • It's sufficient to know that a neural network is approximately "a bunch of matrix multiplications with non-linear activation functions in between".

  • Have ~8 hours to spare before the workshop to work through the prerequisites (may be less, depending on background).

If this isn't you, you'll likely struggle and not enjoy the day.

Learning Objectives

  • Understand what a transformer is, and how it is used.

  • Learn what tokenization is at a high level.

  • Understand the causal attention mechanism in transformers, and how to construct it by hand.

  • Understand what logits are, and how to use them to derive a probability distribution over the vocabulary.
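As a rough preview of the last two objectives, here is a minimal sketch in plain Python (the workshop itself uses PyTorch). The function names, the example logits, and the example attention scores are all made up for illustration and are not taken from the workshop material. It shows a softmax turning logits into a probability distribution, and a causal mask stopping each position from attending to later positions.

```python
import math

def softmax(logits):
    """Turn a list of real-valued logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention_weights(scores):
    """Mask a square matrix of attention scores causally, then softmax each row.

    Row i is the query position and column j is the key position.
    Position i may only attend to positions j <= i, so later positions
    are set to -inf, which softmax maps to a weight of exactly 0.
    """
    n = len(scores)
    weights = []
    for i in range(n):
        masked = [scores[i][j] if j <= i else float("-inf") for j in range(n)]
        weights.append(softmax(masked))
    return weights

# Hypothetical logits over a 4-token vocabulary.
probs = softmax([2.0, 1.0, 0.1, -1.0])

# Hypothetical raw attention scores for a 3-token sequence.
attn = causal_attention_weights([
    [0.5, 1.2, -0.3],
    [0.1, 0.4, 2.0],
    [1.0, 1.0, 1.0],
])
```

The first token can only attend to itself, so its row of weights puts all of its mass on position 0, while the last token spreads its weight over all three positions.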

After today, you will understand exactly what happens when you interact with large language models like GPT-2. We hide nothing*, no magic!

*Due to time constraints we do not implement the tokenizer. Andrej Karpathy has videos on this if you're curious, but we will black-box this (and only this) for today. It's just a look-up table from "words" (tokens) to numbers, so it's not that interesting, and isn't core to understanding how the transformer works.
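To make the "look-up table" idea concrete, here is a toy sketch in plain Python. The vocabulary, the `<unk>` fallback, and the `encode`/`decode` helpers are hypothetical and word-level only; GPT-2's real tokenizer uses byte-pair encoding over subword units, which this does not attempt to reproduce.

```python
# Hypothetical toy vocabulary mapping tokens to integer IDs.
vocab = {"the": 0, "cat": 1, "sat": 2, "<unk>": 3}
id_to_token = {token_id: token for token, token_id in vocab.items()}

def encode(text):
    """Split on whitespace and look each lowercased word up in the table."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

def decode(ids):
    """Map IDs back to tokens and join them with spaces."""
    return " ".join(id_to_token[token_id] for token_id in ids)

ids = encode("The cat sat")
text = decode(ids)
```

Words that are missing from the table fall back to the `<unk>` ID here, which is one simple way to handle unknown input in a toy setting.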


About the teacher

David Quarel is a PhD student at the Australian National University (ANU) in Canberra, Australia, working on AI safety at the London Initiative for Safe AI (LISA), and a teaching assistant for ARENA. David has years of experience as a teacher, developing content both for courses at the ANU and for ARENA. He recently co-authored a new textbook on Universal Artificial Intelligence.

Ticket price covers breakfast, lunch, dinner, and a contribution toward FutureHouse.uk, our venue. This is a not-for-profit event.

Before the event:

Depending on your background, you may be able to skip bits and pieces. We estimate there are ~8 hours of content if you were to do all of it. Prioritise the Einops exercises, as you will not be able to build GPT-2 without them.

Agenda

  • 0930 - Doors Open & Breakfast

  • 1000 - Starting bang on - please don't be late

  • 1230 - Lunch

  • 1830 - Dinner

Join the WhatsApp group to ask questions about the event and the prerequisite material.

Slides for tomorrow: https://docs.google.com/presentation/d/1umuZLA4ZunbfLMfJTyGYb65Fr824PcFtsH41GtsKVGY/edit?usp=sharing

Material for tomorrow: https://arena3-chapter1-transformer-interp.streamlit.app/[1.1]_Transformer_from_Scratch

Location
Future House
Unit 4 Prout Rd, London E5 9NP, UK