Mixture-of-Recursions: The Power of Recursive Transformers
About the Event
🔬 AI4Science on alphaXiv
🗓 Friday, August 1st, 2025 · 11 AM PT
🎙 Featuring Reza Bayat
💬 Casual Talk + Open Discussion
🎥 Zoom: https://stanford.zoom.us/j/98617880120?pwd=nwDLtfmpToIaFpl3rLn4lwRmMNRWEk.1
What if language models could learn to "think harder" only when they need to—allocating deep computation to challenging tokens while breezing through simple ones?
Reza Bayat presents Mixture-of-Recursions (MoR), an architecture that unifies parameter sharing with adaptive computation. By dynamically assigning different recursion depths to individual tokens, MoR achieves large-model quality with significantly fewer parameters and less compute.
Whether you’re working at the frontier of LLMs or simply curious about AI4Science, we’d love to have you there.
Hosted by: alphaXiv x Intology
AI4Science: join the community