
Evaluate LLMs with Evidently open-source
Evidently is an open-source tool to evaluate, test, and monitor ML-powered systems.
Join our Evidently Community Call for an in-depth walkthrough of the latest open-source features. You will learn how to effectively evaluate, test, and monitor LLM-powered applications, including RAG systems and chatbots.
Deep dive with a code demo. We will walk through the code for designing and running evaluations for a sample LLM system, covering several evaluation methods, including:
LLM as a judge
Model-based evaluations
Semantic similarity
Regular expressions
Drift detection
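To make two of the listed methods concrete, here is a minimal, library-agnostic sketch in plain Python (not Evidently's actual API): a regular-expression check that computes the share of responses matching a pattern, and a toy semantic-similarity score using bag-of-words cosine. Real evaluations would use sentence embeddings for similarity; the function and variable names here are illustrative only.

```python
import math
import re
from collections import Counter

def regex_pass_rate(responses, pattern):
    """Share of responses matching a required pattern (e.g. a refusal phrase)."""
    compiled = re.compile(pattern, re.IGNORECASE)
    hits = sum(1 for r in responses if compiled.search(r))
    return hits / len(responses)

def cosine_similarity(a, b):
    """Toy semantic similarity: cosine over bag-of-words counts.
    Production setups would compare sentence embeddings instead."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

# Hypothetical model outputs to evaluate.
responses = ["I cannot help with that request.", "Sure, here is the answer."]
rate = regex_pass_rate(responses, r"cannot help")   # 1 of 2 responses match
sim = cosine_similarity("the cat sat", "the cat sat")  # identical texts
```

The same pattern generalizes: each method reduces a set of texts to a numeric score that can then be reported, tested, or monitored.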
You will learn how to generate visual Reports summarizing evaluation results, run Test Suites for regression testing, and get a live Monitoring Dashboard to track evaluation results over time.
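Conceptually, regression testing here means re-running the same evaluations on a new model or prompt version and asserting that scores do not fall below agreed thresholds. A minimal sketch of that idea in plain Python (this is not Evidently's Test Suite API; the metric names and threshold values are hypothetical):

```python
def run_test_suite(scores, thresholds):
    """Compare metric scores against minimum thresholds; return per-test pass/fail."""
    results = {}
    for name, minimum in thresholds.items():
        score = scores.get(name)
        results[name] = score is not None and score >= minimum
    return results

# Hypothetical scores from the current model version vs. required minimums.
scores = {"semantic_similarity": 0.91, "regex_pass_rate": 0.80}
thresholds = {"semantic_similarity": 0.85, "regex_pass_rate": 0.95}
results = run_test_suite(scores, thresholds)
# semantic_similarity passes its threshold; regex_pass_rate does not.
```

Running such a suite on every model or prompt change turns one-off evaluations into a repeatable quality gate.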
We will explain how you can customize such evaluations to detect toxicity, bias, personal data leakage, or other properties that define AI quality for your use case. We will also discuss the pros and cons of different methods and answer your questions.
