
Mixture-of-Recursions: The Power of Recursive Transformers

Hosted by alphaXiv
Zoom
About Event

🔬 AI4Science on alphaXiv
🗓 Friday, August 1st, 2025 · 11AM PT
🎙 Featuring Reza Bayat
💬 Casual Talk + Open Discussion

🎥 Zoom: https://stanford.zoom.us/j/98617880120?pwd=nwDLtfmpToIaFpl3rLn4lwRmMNRWEk.1


What if language models could learn to "think harder" only when they need to—allocating deep computation to challenging tokens while breezing through simple ones?

Reza Bayat presents Mixture-of-Recursions (MoR), a breakthrough architecture that unifies parameter sharing with adaptive computation. By dynamically assigning different recursion depths to individual tokens, MoR achieves large-model quality with significantly fewer parameters and less compute.
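
To make the idea concrete, here is a minimal, hypothetical PyTorch sketch of per-token adaptive recursion depth. This is not the paper's implementation: the `RecursiveBlock`, its linear `router`, and the 0.5 exit threshold are illustrative assumptions, and a real MoR model would skip computation for exited tokens rather than merely masking their updates.

```python
# Hypothetical sketch (not the authors' code): one weight-shared
# transformer layer applied recursively, with a learned router that
# decides per token whether to recurse again.
import torch
import torch.nn as nn

class RecursiveBlock(nn.Module):
    def __init__(self, d_model: int, max_depth: int = 4):
        super().__init__()
        # A single layer reused at every depth -> parameter sharing.
        self.layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        # Per-token score: "does this token need another recursion?"
        self.router = nn.Linear(d_model, 1)
        self.max_depth = max_depth

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model). `active` marks tokens still recursing.
        active = torch.ones(x.shape[:2], dtype=torch.bool, device=x.device)
        for _ in range(self.max_depth):
            if not active.any():
                break
            refined = self.layer(x)  # same weights at every recursion depth
            # Exited tokens keep their value; active ones take the update.
            # (A real implementation would avoid computing them at all.)
            x = torch.where(active.unsqueeze(-1), refined, x)
            # Router decides, per token, whether to go one level deeper.
            active = active & (torch.sigmoid(self.router(x)).squeeze(-1) > 0.5)
        return x

tokens = torch.randn(2, 16, 64)          # toy batch: 2 sequences of 16 tokens
print(RecursiveBlock(64)(tokens).shape)  # torch.Size([2, 16, 64])
```

The point the sketch tries to show: every recursion step reuses the same layer weights (parameter sharing), while the router makes the depth decision token by token (adaptive computation).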


Whether you’re working on the frontier of LLMs or just curious about AI4Science, we’d love to have you there.

Hosted by: alphaXiv x Intology

AI4Science: join the community
