Google DeepMind’s Griffin architecture: A challenger to the Transformer?
In the last few years, transformers have become the default architecture for sequence modeling tasks like language modeling. However, newer families of models, such as state space models, are challenging that status quo.
In this talk, Google DeepMind research scientist Aleksandar Botev will survey recent progress on this class of models and offer context and perspective on them from both theoretical and practical points of view.
He will argue that not only the choice of recurrent layer but also the overall block design and architecture play a major role in these models' success.
Building on this, Aleksandar will present Griffin to the BuzzRobot community: a model that combines gated linear recurrences with local attention, matches the performance of Transformers, and is significantly faster at inference in both latency and throughput.
He will also show that these models can exploit contexts much longer than those they were trained on, and he will discuss the interesting implications of this.
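For readers unfamiliar with gated linear recurrences, here is a minimal sketch of the general idea: an element-wise linear recurrence whose decay is modulated by input-dependent gates, so the state stays a fixed size no matter how long the sequence gets. The parameter names, shapes, and exact gating/normalisation below are illustrative assumptions, not the precise recurrent layer presented in the talk or the Griffin paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_linear_recurrence(x, w_r, w_i, a_base):
    """Illustrative gated linear recurrent scan (not the exact layer from Griffin).

    x:      (T, D) input sequence
    w_r:    (D, D) weights producing the recurrence gate (assumed parameterisation)
    w_i:    (D, D) weights producing the input gate (assumed parameterisation)
    a_base: (D,)   learnable decay logits controlling how quickly the state forgets
    """
    T, D = x.shape
    h = np.zeros(D)            # fixed-size recurrent state
    out = np.zeros((T, D))
    for t in range(T):
        r = sigmoid(x[t] @ w_r)        # recurrence gate in (0, 1)
        i = sigmoid(x[t] @ w_i)        # input gate in (0, 1)
        a = sigmoid(a_base) ** r       # gated, element-wise decay in (0, 1)
        # Decay the old state and mix in the gated input; the sqrt term keeps
        # the magnitude of the state roughly stable across time steps.
        h = a * h + np.sqrt(1.0 - a ** 2) * (i * x[t])
        out[t] = h
    return out

# Tiny usage example with random weights.
rng = np.random.default_rng(0)
T, D = 8, 4
y = gated_linear_recurrence(
    rng.normal(size=(T, D)),
    0.1 * rng.normal(size=(D, D)),
    0.1 * rng.normal(size=(D, D)),
    rng.normal(size=D),
)
print(y.shape)  # (8, 4)
```

Because the state has constant size, each new token costs a fixed amount of work and memory at inference time, in contrast to global attention, whose key-value cache grows with the context; this is one intuition behind the latency and throughput gains mentioned above.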