DeepEval

DeepEval is an open-source Python framework for evaluating LLM applications as unit tests. It ships with research-backed metrics including GEval, AnswerRelevancyMetric, FaithfulnessMetric, TaskCompletionMetric, and ConversationalGEval, and supports end-to-end and component-level testing, multi-turn conversations, and LLM tracing for agents.

DeepEval is one of three APIs that Confident AI publishes on the APIs.io network.

The entry is tagged Open Source, LLM Evaluation, Python, and Testing Framework. Its published artifacts on APIs.io are a getting-started guide, documentation, the source code repository on GitHub, and an SDK package on PyPI.

API entry from apis.yml

aid: confident-ai:deepeval
name: DeepEval
tags:
- Open Source
- LLM Evaluation
- Python
- Testing Framework
humanURL: https://deepeval.com/
properties:
- url: https://deepeval.com/docs/getting-started
  type: GettingStarted
- url: https://deepeval.com/docs/
  type: Documentation
- url: https://github.com/confident-ai/deepeval
  type: SourceCode
- url: https://pypi.org/project/deepeval/
  type: SDK
description: DeepEval is an open-source Python framework for evaluating LLM applications as unit tests.
  It ships with research-backed metrics including GEval, AnswerRelevancyMetric, FaithfulnessMetric, TaskCompletionMetric,
  and ConversationalGEval, and supports end-to-end and component-level testing, multi-turn conversations,
  and LLM tracing for agents.
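
The properties list in the entry above pairs each URL with a type, so a consumer can look up a link by its type. The following is a minimal sketch of that lookup, not an official APIs.io client: the `entry` dictionary is a hand-copied, in-memory stand-in for the parsed YAML (the description field is omitted), and `property_urls` is a hypothetical helper name. A real consumer would most likely load apis.yml with a YAML parser instead.

```python
# Assumption: a hand-written copy of the apis.yml entry shown above.
# In practice this dictionary would come from parsing the YAML file.
entry = {
    "aid": "confident-ai:deepeval",
    "name": "DeepEval",
    "tags": ["Open Source", "LLM Evaluation", "Python", "Testing Framework"],
    "humanURL": "https://deepeval.com/",
    "properties": [
        {"url": "https://deepeval.com/docs/getting-started", "type": "GettingStarted"},
        {"url": "https://deepeval.com/docs/", "type": "Documentation"},
        {"url": "https://github.com/confident-ai/deepeval", "type": "SourceCode"},
        {"url": "https://pypi.org/project/deepeval/", "type": "SDK"},
    ],
}


def property_urls(entry):
    """Group the entry's property URLs by their declared type.

    Hypothetical helper: a type may appear more than once in an entry,
    so each type maps to a list of URLs rather than a single URL.
    """
    grouped = {}
    for prop in entry.get("properties", []):
        grouped.setdefault(prop["type"], []).append(prop["url"])
    return grouped


urls = property_urls(entry)
print(urls["SourceCode"][0])
```

Grouping into lists rather than a flat type-to-URL mapping is a deliberate choice here, since nothing in the entry guarantees that each property type appears only once.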