

Reinforcement Learning and Universal Rewards
Reinforcement learning has long been recognized as a powerful way to make AI agents more reliable. But outside of frontier labs, its adoption has been limited by a critical bottleneck: the need for bespoke, high-quality reward functions — often requiring labeled datasets, complex engineering, and deep domain expertise.
RULER (Relative Universal LLM-Elicited Rewards) changes that.
This new, general-purpose reward function allows developers to apply RL to almost any agent, with no labeled data, hand-crafted reward functions, or human feedback. In benchmarks, RULER-trained models not only match but often outperform models trained with task-specific rewards, across diverse, real-world applications.
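As a rough intuition for how a group-relative, LLM-elicited reward can plug into an RL training loop, here is a minimal Python sketch. It is an illustrative assumption, not OpenPipe's actual RULER API: the function name relative_llm_rewards, the judge callable, and the toy scoring below are hypothetical stand-ins for a real LLM judge call.

from typing import Callable, List


def relative_llm_rewards(
    trajectories: List[str],
    judge: Callable[[str], List[float]],
) -> List[float]:
    """Score a group of agent trajectories relative to one another.

    `judge` is a placeholder for an LLM call that receives one prompt
    containing every trajectory in the group and returns one score per
    trajectory. Scoring the whole group in a single pass lets the judge
    rank candidates against each other instead of grading each one in
    isolation.
    """
    prompt = "Rank the following candidate trajectories for the same task.\n\n"
    for i, trajectory in enumerate(trajectories):
        prompt += f"--- Candidate {i + 1} ---\n{trajectory}\n\n"

    scores = judge(prompt)

    # Normalize within the group so rewards are comparable across batches,
    # which is what group-relative RL algorithms expect.
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


if __name__ == "__main__":
    # Toy judge returning fixed scores; a real implementation would call an LLM.
    def toy_judge(prompt: str) -> List[float]:
        return [1.0, 3.0, 2.0]

    print(relative_llm_rewards(["draft A", "draft B", "draft C"], toy_judge))

The idea this sketch tries to capture is that candidates for the same task are scored against each other in a single judge pass, and the resulting scores are normalized within the group rather than treated as absolute grades.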
Join Kyle Corbitt, CEO of OpenPipe, for an exclusive first look at the research, the results, and the implications:
Why RL adoption has been stuck — and how RULER changes the equation
The two key insights that make RULER more effective than traditional LLM-as-judge or hand-crafted rewards
Real-world results, from customer support agents to edge-deployed reasoning systems
Where this is headed next — including potential for runtime adaptation and continuous online learning
Be among the first to understand — and start using — a framework that could redefine how we train reliable AI agents.
The AI Collective is a global non-profit community uniting 70,000+ pioneers – founders, researchers, operators, and investors – exploring the frontier of AI in major tech hubs worldwide. Through events, workshops, and community-led research, we empower the AI ecosystem to collaboratively steer AI’s future toward trust, openness, and human flourishing.
All attendees and organizers at events affiliated with The AI Collective are subject to our Code of Conduct.