

Transformer Architecture from Scratch
In this talk, we will explore the groundbreaking paper "Attention Is All You Need" by Vaswani et al., which introduced the Transformer architecture—revolutionizing deep learning by replacing recurrence with self-attention. We will break down the key components of the Transformer, understand why self-attention enables parallelization and captures long-range dependencies, and discuss its impact across NLP and vision tasks. To gain a hands-on understanding, we will implement the Transformer Encoder-Decoder architecture from scratch using PyTorch.
Additionally, we will explore how multi-head attention is interpreted using Anthropic's "0-layer theory" and "1-layer theory", which provide insights into how attention heads behave in shallow models and early Transformer layers.
Agenda:
1. Understanding the Transformer Architecture
   - Key innovations in "Attention Is All You Need"
   - Comparison with previous sequence models (RNNs, LSTMs)
   - Role of self-attention and positional encoding
2. Breaking Down the Transformer Components
   - Multi-head self-attention: mechanism and purpose of multiple attention heads
   - Interpretation of multi-head attention through:
     - 0-layer theory: 0-layer Transformers approximate bigram statistics
     - 1-layer theory: how single-layer Transformers give rise to simple forms of in-context learning
   - Feedforward layers and layer normalisation
   - Positional encoding and residual connections (see the positional-encoding sketch after the agenda)
3. Implementing the Transformer from Scratch
   - Writing multi-head self-attention in PyTorch (see the attention sketch after the agenda)
   - Building the encoder and decoder blocks (see the encoder-block sketch after the agenda)
4. Discussion & Future Directions
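As a preview of the positional-encoding discussion in the agenda, here is a minimal sketch of the sinusoidal positional encoding from "Attention Is All You Need", assuming the standard formulation PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and an even d_model; the module and variable names are illustrative, not the talk's actual code.

```python
import math
import torch
import torch.nn as nn

class SinusoidalPositionalEncoding(nn.Module):
    """Adds the fixed sinusoidal position signal from "Attention Is All You Need"."""

    def __init__(self, d_model: int, max_len: int = 5000):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)                          # (max_len, 1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)                           # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)                           # odd dimensions
        self.register_buffer("pe", pe)                                         # fixed, not learned

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); add the encoding for the first seq_len positions
        return x + self.pe[: x.size(1)]
```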
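For the hands-on part of the talk, the sketch below shows one way to write multi-head self-attention in PyTorch, assuming d_model is divisible by num_heads and using a fused Q/K/V projection; class and parameter names are illustrative and the implementation built in the session may differ in detail.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadSelfAttention(nn.Module):
    """Scaled dot-product attention over several heads, as in "Attention Is All You Need"."""

    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        self.qkv_proj = nn.Linear(d_model, 3 * d_model)   # fused Q, K, V projections
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor, mask: torch.Tensor | None = None) -> torch.Tensor:
        batch, seq_len, d_model = x.shape
        # Project and split into (batch, heads, seq_len, d_head)
        qkv = self.qkv_proj(x).reshape(batch, seq_len, 3, self.num_heads, self.d_head)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)              # each: (batch, heads, seq, d_head)

        # Scaled dot-product attention: softmax(QK^T / sqrt(d_head)) V
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        if mask is not None:                              # e.g. a causal mask in the decoder
            scores = scores.masked_fill(mask == 0, float("-inf"))
        attn = F.softmax(scores, dim=-1)
        out = attn @ v                                    # (batch, heads, seq, d_head)

        # Concatenate the heads and apply the final linear projection
        out = out.transpose(1, 2).reshape(batch, seq_len, d_model)
        return self.out_proj(out)
```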
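To connect the remaining pieces from item 2 (feedforward layers, layer normalisation, residual connections), here is a minimal sketch of a single encoder block that reuses the MultiHeadSelfAttention module above; it follows the post-norm arrangement of the original paper, and the dropout rate is an illustrative default.

```python
class TransformerEncoderBlock(nn.Module):
    """One encoder block: self-attention and a position-wise FFN, each wrapped in a
    residual connection followed by layer normalisation (post-norm, as in the paper)."""

    def __init__(self, d_model: int, num_heads: int, d_ff: int, dropout: float = 0.1):
        super().__init__()
        self.self_attn = MultiHeadSelfAttention(d_model, num_heads)
        self.ffn = nn.Sequential(                         # position-wise feedforward network
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Linear(d_ff, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor, mask: torch.Tensor | None = None) -> torch.Tensor:
        # Sub-layer 1: residual connection around multi-head self-attention
        x = self.norm1(x + self.dropout(self.self_attn(x, mask)))
        # Sub-layer 2: residual connection around the feedforward network
        x = self.norm2(x + self.dropout(self.ffn(x)))
        return x
```

For example, TransformerEncoderBlock(512, 8, 2048) applied to a (batch, seq_len, 512) tensor matches the base-model dimensions used in "Attention Is All You Need".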