Attention Alternatives & Scaling Context Windows (w/ Innovation Endeavors & Cerebral Valley)
Recently, there has been substantial interest in scaling LLM context windows further by getting around the quadratic cost of attention, with approaches such as Mamba, StripedHyena, state space models, and Monarch Mixer.
Come join us for an evening where we will discuss this emerging line of research.
We will kick things off with presentations by some of the leading researchers exploring attention alternatives and sub-quadratic scaling of context windows, and then we'll have an open happy hour.
Presentations by:
Karan Goel (CEO of Cartesia AI) - The history and future of State Space Models
Jacob Buckman (Manifest AI) - Drawbacks & Opportunities for Linear Transformers
Bailin Wang (MIT) - Gated Linear Attention Transformers with Hardware-Efficient Training
Michael Poli - StripedHyena
Eugene Cheah - RWKV
Hosted by Innovation Endeavors and Cerebral Valley