Arxiv Dives with Oxen.AI - Samba (with Author Guest: Liliang Ren)
Hey Nerd, join the Herd!... for a little book/paper review.
SPECIAL GUEST: LILIANG REN, AUTHOR
We are STOKED to have the lead paper author, Liliang Ren (Senior Researcher at Microsoft GenAI), join our discussion. So come prepared and bring your questions! And let's give an extra big thank you to Liliang.
WHAT TO EXPECT
Each week we pick a topic to cover in depth and have open Q/A and discussion. Reading optional.
We'll also alternate in some Practical ML time, which includes 1) live, working code, 2) running models on real data, and 3) learning to apply the principles to your own work.
THIS WEEK
We will be covering... Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
βEfficiently modeling sequences with infinite context length has been a long-standing problem. Past works suffer from either the quadratic computation complexity or the limited extrapolation ability on length generalization. In this work, we present Samba, a simple hybrid architecture that layer-wise combines Mamba, a selective State Space Model (SSM), with Sliding Window Attention (SWA). Samba selectively compresses a given sequence into recurrent hidden states while still maintaining the ability to precisely recall memories with the attention mechanism.
JOIN THE CONVO
Make sure to also join our Discord here (https://discord.gg/s3tBEn7Ptg) to share recommendations for future sessions and more community discussion.
SEE PAST SESSIONS
To see past topics, head over to our blog, which has show notes and links to the YouTube videos.
WHO'S AN ARXIV DIVER
We've been joined by folks from around the world, including leaders from:
and many more...
Sign up
We'll be sharing datasets relevant to these sessions via our wicked fast and, we think, pretty beautiful Oxen.ai. To get the most out of participating in Arxiv Dives, we recommend:
About Oxen.ai: Build World-Class AI Datasets, Together. Track, iterate, collaborate on, & discover data in any format.
About Arxiv Dives
Each week we dive deep into a topic in machine learning or artificial intelligence. We break down the content into a digestible format and have an open discussion with the Oxen.ai team and friends. We like to nerd out, so expect to dive deep! Read more on why we host these in our Arxiv Dive Manifesto.