


The Eval Edition - Know before you ship
How do you know your AI actually works?
We’re talking evals - the most underhyped part of building real-world AI.
This edition of Two Tokens Friday is all about how to evaluate, test, and trust the AI you’re shipping.
What’s on the Menu
Tool Drops
Eval Playground – Build & run evals with ease. Plug in your prompts and datasets, then run side-by-side comparisons.
Structured Feedback Infra – What users feel vs. what models do. We’re bridging the gap.
Headliner
A product leader from one of India’s top AI startups joins us to share how their team runs evals in production - from model changes to feature launches.
Hands-On Fun
Red Team Roulette – Spot model failures and earn swag.
Build Your Own Eval Run – Try out our tools live and take your evals beyond spreadsheets.
Snacks & Sharp Talk
No filler. Just honest conversations about what works, what fails, and how teams are keeping AI grounded in reality.
Who It’s For
Founders & PMs pushing AI to prod
Engineers building evaluation infra
Researchers tired of toy benchmarks
Anyone who wants to move beyond “looks good”
Bottom Line
If you’ve ever asked, “Is this model good enough?” then this event is for you.
Walk in curious. Walk out with a playbook.