From Prompt to Production: Smarter AI with Evaluations

Hosted by n8n, Desiree & Angel Menendez

YouTube

Past Event


About Event

What You'll Learn

Join host Angel Menendez and special guest Elvis Saravia to explore strategies for evaluating AI and how to build them into your development process.

In this session, you'll learn how to:

Apply industry best practices for evaluating agentic workflows
Build evaluation datasets
Select and design evaluation metrics
Perform evaluations with n8n

One of the more challenging aspects of building production-ready AI solutions is designing them to handle unpredictable real-world inputs and tricky edge cases. Identifying and fixing these can take up the majority of development time, especially when a small prompt tweak might solve one issue but introduce new, unintended behavior elsewhere. And even without prompt changes, model outputs can drift over time, making it hard to know whether your system is improving or quietly degrading.

For development teams, this creates uncertainty and risk, especially when AI outputs are customer-facing or business-critical. This is where incorporating evaluations into your AI development process helps. Effective evaluations enable data-driven decisions on whether to adjust a prompt or switch to a new model, and they help catch regressions by monitoring performance, giving you peace of mind for AI solutions in production.
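To make the idea concrete, here is a minimal sketch in Python of what an evaluation can look like: a small dataset with one edge case, an exact-match metric, and a check that flags a regression when a changed workflow scores lower than the baseline. Everything in it is hypothetical and for illustration only. The dataset, the two stand-in workflow functions, and the `is_regression` helper are assumptions, and none of this is n8n's evaluation feature or API.

```python
from typing import Callable

# Hypothetical evaluation dataset: each case pairs an input with the expected label.
DATASET = [
    {"input": "Where is my refund?", "expected": "billing"},
    {"input": "The app crashes on login", "expected": "technical"},
    {"input": "Cancel my subscription", "expected": "billing"},
    {"input": "", "expected": "unknown"},  # edge case: empty input
]


def exact_match_score(workflow: Callable[[str], str], dataset: list) -> float:
    """Return the fraction of cases where the workflow output equals the expected label."""
    hits = sum(1 for case in dataset if workflow(case["input"]) == case["expected"])
    return hits / len(dataset)


def baseline_workflow(text: str) -> str:
    """Stand-in for the current prompt/model (assumed, not a real AI call)."""
    if not text:
        return "unknown"
    lowered = text.lower()
    return "billing" if ("refund" in lowered or "subscription" in lowered) else "technical"


def candidate_workflow(text: str) -> str:
    """Stand-in for a tweaked prompt that accidentally drops the empty-input handling."""
    lowered = text.lower()
    return "billing" if ("refund" in lowered or "subscription" in lowered) else "technical"


def is_regression(baseline: float, candidate: float, tolerance: float = 0.0) -> bool:
    """Flag a regression when the candidate scores below the baseline by more than tolerance."""
    return candidate < baseline - tolerance


baseline_score = exact_match_score(baseline_workflow, DATASET)
candidate_score = exact_match_score(candidate_workflow, DATASET)
print(f"baseline={baseline_score:.2f} candidate={candidate_score:.2f}")
print("regression detected:", is_regression(baseline_score, candidate_score))
```

In this sketch the candidate handles the three ordinary inputs but mislabels the empty one, so its score drops and the check flags it. Real evaluations typically use larger datasets and metrics suited to the task; exact match is only the simplest choice.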

Learn more in our blog, Introducing Evaluations for AI workflows.

Guest: Elvis Saravia is a co-founder of DAIR.AI, where he leads all AI research, education, and engineering efforts. Elvis holds a Ph.D. in computer science, specializing in NLP and language models. His primary interests are training and evaluating LLMs and developing scalable applications with LLMs. He co-created the Galactica LLM at Meta AI and supported and advised world-class teams like FAIR, PyTorch, and Papers with Code. Prior to this, he was an education architect at Elastic, where he developed technical curriculum and courses on solutions such as Elasticsearch, Kibana, and Logstash.

Who Should Attend?

This webinar is ideal for:

Enterprise teams looking to make data-driven decisions about models and prompt changes.
Developers and engineers seeking to iterate faster while increasing AI reliability.
n8n users at any level who are building and deploying AI solutions.

Session Format

Our live Office Hours will feature expert insights, real-time demonstrations, and interactive Q&A. Submit your AI evaluation questions in advance by 📌 clicking here, and our team will walk you through real-world solutions tailored to your needs.

What to Prepare

No setup is required! Just bring your questions and get ready to learn how to scale automation effectively with n8n.

Join the n8n at SCALE Community

This session is more than just a learning opportunity—it's a chance to connect with other Enterprise and IT Ops professionals. Engage with our experts, share experiences, and become part of a growing network of automation enthusiasts.

📅 Reserve your spot now and get your questions answered live!


1,334 Went

