Hosted By
252 Went

Matryoshka Principles for Adaptive Intelligence

Hosted by Together AI
YouTube
Past Event
About Event

Speaker: Aditya Kusupati, Staff Research Scientist - Google DeepMind

The increasing scale of deep learning models presents significant challenges for deployment across diverse computational environments, each with unique constraints on latency, memory, and energy. Traditional approaches often necessitate training and maintaining separate models for each desired operating point, leading to substantial overhead.

This talk explores the "Matryoshka" principle, a promising paradigm for achieving computational adaptivity within a single trained artifact. Inspired by Russian nesting dolls, Matryoshka methods embed coarser, computationally cheaper structures within finer, more powerful ones, enabling dynamic adjustment of resource usage at inference time.

This technique generalizes across several fundamental components of machine learning, including embeddings, Transformers, and even the integer data type used for quantization. The community has extended it beyond these components, and it has seen a wide array of deployments across industry and open source, serving over a billion users daily.
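
For the embedding case, the nesting idea can be illustrated with a minimal sketch. It assumes the embedding model was trained with a Matryoshka-style objective, so that the first `dim` coordinates of a vector are themselves a usable, lower-cost embedding. The function names and the 8-dimensional vectors below are hypothetical and made up for illustration; they are not taken from the talk or from any particular library.

```python
import math


def truncate_embedding(embedding, dim):
    """Keep the first `dim` coordinates and rescale the prefix to unit length.

    Assumes the source model was trained so that prefixes are meaningful
    (a Matryoshka-style objective); for an ordinary embedding model this
    truncation would generally lose much more quality.
    """
    if not 1 <= dim <= len(embedding):
        raise ValueError("dim must be between 1 and len(embedding)")
    prefix = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        return list(prefix)
    return [x / norm for x in prefix]


def cosine(a, b):
    """Dot product of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


# Made-up vectors standing in for the outputs of a real model.
query = [0.9, 0.3, -0.2, 0.1, 0.05, -0.03, 0.02, 0.01]
doc = [0.8, 0.4, -0.1, 0.2, -0.04, 0.02, 0.01, -0.02]

# The same stored vectors serve several operating points at inference time.
for dim in (2, 4, 8):
    q = truncate_embedding(query, dim)
    d = truncate_embedding(doc, dim)
    print(dim, round(cosine(q, d), 4))
```

A deployment could pick `dim` per request based on its latency or memory budget, for example using a short prefix for a fast first-pass search and the full vector for re-ranking, without training or storing a separate model for each size.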

Collectively, these works demonstrate how the Matryoshka principle facilitates unified training of highly flexible models that can seamlessly adapt their computational footprint post-training, significantly simplifying deployment and enhancing efficiency across heterogeneous hardware.
