ThursTalks with Mistral AI: Evaluations
LLMs as RAG Judges: The Next Frontier in Evaluation
As Large Language Models (LLMs) become integral to AI applications, evaluating their performance, especially in Retrieval-Augmented Generation (RAG) systems, is crucial. Traditional metrics often miss the nuanced quality of RAG outputs, particularly regarding faithfulness and relevance. This talk introduces the innovative approach of "LLM as a Judge," where one LLM evaluates another, providing a scalable, efficient solution that complements human evaluation. We'll also delve into the RAG Triad framework – focusing on Context Relevance, Groundedness, and Answer Relevance – ensuring LLM responses are accurate, contextually sound, and truly useful. Discover how this inventive approach can accelerate your RAG development cycles and lead to more robust, trustworthy AI applications.
More About the Speakers:
William El Sayed is a specialist in public sector go-to-market strategy and regulated industries at Mistral AI, a leading European startup focused on building advanced large language models. Trained as an economist and political scientist, William has a strong background in public policy, having worked with various French government ministries on tech, innovation, and AI strategies, including contributing to France’s national AI roadmap under the France 2030 initiative. At Mistral AI, William initially served as Founder’s Associate, acting as chief of staff to CEO Arthur Mensch, where he was involved in business development, public affairs, and operational leadership. He now leads public sector and regulated industry engagements, helping governments and institutions leverage generative AI to improve innovation, efficiency, and service delivery.
Loh Jia Sin is an AI Solution Architect, APAC at Mistral AI. Passionate about leveraging the latest AI innovations to solve critical business challenges, Jia Sin brings a deep understanding of enterprise technology to his role. His experience as a Solutions Consultant at Google Cloud, guiding organizations through successful cloud transformations, combined with a strong background in platform engineering at GIC and service delivery leadership at Seviora Holdings, equips him with a comprehensive perspective on the technological landscape and the potential of AI for businesses in the APAC region.
--------------------------------------------
Registration begins at 2.30pm, and the sharing session starts at 3pm.
About Lorong AI
Lorong AI is a co-working hub where AI practitioners connect, share knowledge, and grow through curated programming and a collaborative environment. Home to programmes like AI Wednesdays, AI ToolsDays, and Fri-DIYs, Lorong AI provides hands-on workshops, technical deep dives, and opportunities to tackle real-world challenges in AI.