Evaluate AI with Confidence: A 60-Minute Deep Dive
As AI systems become more powerful, evaluating them effectively is key to building trust and sustaining performance. In this session, we’ll walk through Future AGI’s evaluation workflow inside the Build section, covering how to run multi-modal evaluations, set up custom metrics, and use feedback and error localization to improve your models.
This webinar will give you a practical view of how to evaluate smarter, catch issues early, and ensure your models are performing as expected.
💡 What you’ll learn
How evaluations fit into the Build workflow for AI development
What multi-modal evaluations are and how to apply them effectively
How to set up and customize evaluation metrics for different data types (a minimal sketch follows this list)
How to collect user feedback and localize errors in AI outputs
Best practices to improve dataset quality and model performance through smart evaluations
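To make the custom-metric item above concrete before the session, here is a minimal Python sketch of what a custom evaluation metric can look like: a function that scores a model output and explains the score. The names below (EvalResult, keyword_coverage) are hypothetical illustrations, not Future AGI’s actual SDK.

```python
# Hypothetical sketch of a custom evaluation metric; NOT Future AGI's SDK.
# It shows the general shape: score an output, return the score plus a reason.
from dataclasses import dataclass

@dataclass
class EvalResult:
    score: float       # 0.0 to 1.0, higher is better
    explanation: str   # why the output received this score

def keyword_coverage(output: str, required: list[str]) -> EvalResult:
    """Score an output by the fraction of required keywords it contains."""
    hits = [kw for kw in required if kw.lower() in output.lower()]
    score = len(hits) / len(required) if required else 1.0
    return EvalResult(score, f"matched {len(hits)}/{len(required)} keywords: {hits}")

# Example: evaluate a single model output against expected keywords.
result = keyword_coverage(
    output="The model supports text, image, and audio inputs.",
    required=["text", "image", "audio"],
)
print(f"{result.score:.2f} - {result.explanation}")
```

The same score-plus-explanation shape extends naturally to other data types and to model-graded metrics; the webinar covers how this works inside the Build workflow.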
About the Speakers
Nikhil Pareek, Founder & CEO of Future AGI, is a serial entrepreneur with over nine years of experience building startups and leading AI-driven innovation across industries like healthcare, IoT, consulting, and finance. He’s passionate about bringing engineering rigor to AI systems and solving core infrastructure challenges in model development and deployment. At Future AGI, he’s focused on helping teams build trustworthy, production-ready AI through automation, evaluation, and observability.
Rishav Hada, an Applied Scientist at Future AGI, specializes in AI evaluation and observability. Previously at Microsoft Research, he developed frameworks for generative AI evaluation and multilingual language technologies. His research, funded by Twitter and Meta, has been published in top conferences like EMNLP, ACL, and NAACL and integrated into AI products. His recent work on mitigating bias in language technologies won the Best Paper Award at FAccT’24.
About Future AGI
Future AGI is a San Francisco–based AI lifecycle management platform designed to streamline experimentation, evaluation, and real-time observability. Traditional AI tools often rely on guesswork because of gaps in data generation, error analysis, and feedback loops. Future AGI removes that uncertainty by automating the data layer with multi-modal evaluations, agent optimizers, observability, and synthetic data tools, cutting AI development time by up to 95%. By eliminating manual overhead, it brings software engineering rigor to AI and helps teams build high-performing, trustworthy systems faster.
🌐 To learn more about Future AGI, visit here!
👥 Join our growing Slack community to explore multimodal AI, solve real-world eval challenges, and connect with fellow builders.