Cover Image for AI Meetup Berlin: MoE inference economics from first principles
10 Going

AI Meetup Berlin: MoE inference economics from first principles

Registration
Welcome! To join the event, please register below.
About Event

Piotr (Senior AI Inference Engineer at Aleph Alpha, https://x.com/tugot17) will talk about MoE inference economics from first principles.

The release of Kimi K2 has firmly established mixture-of-experts (MoE) models as the leading architecture of large language models (LLMs) at the intelligence frontier. Due to their massive size (over 1 trillion parameters) and sparse computation pattern, selectively activating a subset of parameters rather than the entire model for each token, MoE-style LLMs present significant challenges for inference workloads and substantially alter the underlying inference economics. With ever-growing consumer demand for AI models, as well as the internal need of AGI companies to generate trillions of tokens of synthetic data, "cost per token" is becoming an even more important factor, determining both profit margins and the capital expenditure required for internal reinforcement learning (RL) training rollouts.
In this talk, we will walk through the cost structure of generating a "DeepSeek token," discuss the tradeoffs between latency/throughput and cost, and try to estimate the optimal setup for running such a model.
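As a flavor of the kind of reasoning the talk covers, here is a minimal back-of-the-envelope sketch of cost per token on a fixed GPU cluster. All numbers (GPU price, cluster size, throughput) are illustrative assumptions, not figures from the talk.

```python
# Back-of-the-envelope estimate: cost per generated token for an
# inference deployment. All inputs below are illustrative assumptions.

def cost_per_million_tokens(gpu_hour_usd: float,
                            num_gpus: int,
                            tokens_per_second: float) -> float:
    """USD cost to generate one million tokens on a fixed GPU cluster.

    tokens_per_second is the aggregate decode throughput across all
    concurrent requests served by the cluster.
    """
    cluster_cost_per_second = gpu_hour_usd * num_gpus / 3600.0
    return cluster_cost_per_second / tokens_per_second * 1_000_000


# Example assumptions: 8 GPUs at $2/hour each, 5,000 tokens/s aggregate.
print(f"${cost_per_million_tokens(2.0, 8, 5000.0):.2f} per 1M tokens")
```

Higher batch sizes raise aggregate throughput and drive this number down, at the cost of per-request latency; that tension is exactly the latency/throughput-versus-cost tradeoff the abstract mentions.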

Location
Aleph Alpha Berlin
Ritterstraße 6, 10969 Berlin, Germany
Ground floor