

Efficiency vs. Expansion: AI’s Sustainability Dilemma
The era of unconstrained AI scaling is ending. As frontier models move into production, power caps, inference latency, and rising token-level costs are exposing the limits of traditional scale-first architectures.
This salon convenes infrastructure executives, systems architects, and AI platform leads to confront a shared inflection point: how to architect for throughput, cost efficiency, and sustainability at scale.
What You’ll Explore
Energy as a Strategic Limiting Factor
With megawatt access dictating deployment timelines, energy has become the gating variable for AI growth. Hear how teams are securing capacity, optimizing for joules per token, and designing next-gen data infrastructure to stay ahead.
Architectures for Efficient Inference
Dive into the techniques redefining performance per watt, from Mixture of Experts and speculative decoding to low-bit quantization and sparsity-aware compilers. Learn how the most efficient stacks are pushing past today’s throughput ceilings.
The True Cost of Intelligence
Scaling isn’t just about model parameters. We’ll unpack the full total cost of ownership (TCO) of AI, including GPU fleet orchestration, silicon availability, regulatory overhead, and alignment with business KPIs.
Why Attend
Get practical frameworks for navigating the Inference Wall
Learn how to turn efficiency into a competitive advantage
Benchmark against the most efficient infrastructure and model strategies in play
Connect with handpicked leaders building the next generation of scalable, sustainable AI systems
Source know-how from practitioners who’ve deployed these techniques at scale