

Valyu x Common Crawl x UCL: AI Agents, Crawling and the Future of the Web
The web was built to share knowledge and fuel open research. Now, AI agents are its fastest-growing users. As they crawl, retrieve, and learn from everything online, how do we balance open data, access, and control?
What We’ll Cover:
• AI Agents & the Open Web – How AI-driven retrieval is changing content access.
• Robots.txt for AI? – A new proposal to update web crawling rules for AI agents.
• AI Preference Signaling – How to communicate content use policies in an AI-first web.
• Opt-In/Opt-Out – Consent models and the standards evolving around them.
We’ll be joined by Thom Vaughan (Common Crawl), who will introduce AI Preferences (AIPREF), a proposal designed to give publishers more control over AI access to their content.
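For context, today's opt-out signaling mostly happens through robots.txt directives aimed at known AI crawler user agents, such as GPTBot (OpenAI) and CCBot (Common Crawl). The sketch below shows that current approach; it is illustrative only and is not the AIPREF syntax, which the proposal is still defining:

```text
# robots.txt — illustrative example of today's per-crawler opt-out
# (not AIPREF syntax; a coarse, crawler-by-crawler consent model)

# Opt out of OpenAI's training crawler
User-agent: GPTBot
Disallow: /

# Opt out of Common Crawl's crawler
User-agent: CCBot
Disallow: /

# All other crawlers (e.g. traditional search) remain allowed
User-agent: *
Allow: /
```

The limitation this illustrates is exactly what AIPREF targets: publishers must track every AI crawler individually, and robots.txt can only say "crawl or don't", not express preferences about how content is used once retrieved.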
The web is evolving. AI is shaping how knowledge is retrieved, archived, and used. The question is: are we building the right foundations for what comes next?
---
Photo by Google DeepMind from Pexels: https://www.pexels.com/photo/geometric-graphic-design-25626508/