$ the open-source LLM evaluation framework

Get Started Try the DeepEval Platform

Delivered by

Confident AI

Unit-Testing for LLMs

Native integration with Pytest, that fits right in your workflow.

LLM-as-a-Judge Metrics

40+ research-backed metrics, including custom G-Eval and deterministic metrics.

Single and Multi-Turn Evals

Covering any use cases, any system architecture, including multi-modality.