


First Paris inference & vLLM meet-up
Join us for the first inference & vLLM community meet-up in Paris, bringing together AI practitioners, infrastructure experts, and companies using vLLM in production.
Whether you're experimenting with vLLM or running large-scale inference workloads, this event is for you. Expect hands-on insights, real-world feedback, and open discussions with others working on optimizing inference at scale.
📍 Location: Paris, France
🌎 Language: English
🕖 Time: 7:00 PM – 9:00 PM
💬 Format: In-person
Agenda:
7:00 – 7:30 PM: Welcome
7:30 – 8:15 PM: Talks
Exxa – Etienne Balit (CTO): Intro to vLLM
AMD – Félix Marty (ML/SW): Quantization on AMD Instinct
Scaleway – Grégoire de Turckheim (Engineering Manager): Deploying vLLM at scale (5,000+ GPUs)
8:15 – 9:00 PM: Open networking with drinks and pizza
We’ll discuss performance optimizations, scaling strategies, hardware compatibility, and more.
🎯 Who should come?
ML engineers, infra & DevOps teams, AI founders, and anyone using or evaluating vLLM in their stack.
🎟️ Free registration – spots are limited (first edition)