LangSmith Evaluation

LangSmith Evaluation is LangChain's evaluation framework for measuring application quality across the lifecycle. The docs describe evals as "a way to breakdown what 'good' looks like and measure it." It supports code evaluators (deterministic rules), LLM-as-judge evaluators (reference-based or reference-free), and heuristic checks (length, latency, keywords). Concepts include datasets and examples, experiments, and pairwise evaluation for relative comparisons.
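The evaluator types and concepts named above can be illustrated with a small, library-free sketch. It does not use the LangSmith SDK: `Example`, `exact_match`, `within_length`, `run_experiment`, `pairwise_preference`, and `toy_app` are all hypothetical names chosen for illustration, and the 40-character limit is an arbitrary assumption. An LLM-as-judge evaluator is left out because it would require a model call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Example:
    """One dataset row: an input plus a reference output (hypothetical shape)."""
    question: str
    reference: str


def exact_match(output: str, example: Example) -> float:
    """Code evaluator: a deterministic, reference-based rule."""
    same = output.strip().lower() == example.reference.strip().lower()
    return 1.0 if same else 0.0


def within_length(output: str, example: Example, max_chars: int = 40) -> float:
    """Heuristic check: reference-free length limit (40 is an arbitrary assumption)."""
    return 1.0 if len(output) <= max_chars else 0.0


Evaluator = Callable[[str, Example], float]


def run_experiment(
    target: Callable[[str], str],
    dataset: List[Example],
    evaluators: Dict[str, Evaluator],
) -> Dict[str, float]:
    """Experiment: run the target on every example and average each evaluator's score."""
    totals = {name: 0.0 for name in evaluators}
    for example in dataset:
        output = target(example.question)
        for name, evaluator in evaluators.items():
            totals[name] += evaluator(output, example)
    return {name: total / len(dataset) for name, total in totals.items()}


def pairwise_preference(output_a: str, output_b: str, example: Example) -> str:
    """Pairwise evaluation: a relative judgment between two outputs, not an absolute score."""
    score_a = exact_match(output_a, example)
    score_b = exact_match(output_b, example)
    if score_a == score_b:
        return "tie"
    return "a" if score_a > score_b else "b"


# A toy dataset and a toy application under test (both made up for this sketch).
DATASET = [
    Example("What is the capital of France?", "Paris"),
    Example("What is 2 + 2?", "4"),
]


def toy_app(question: str) -> str:
    answers = {
        "What is the capital of France?": "Paris",
        "What is 2 + 2?": "five",  # deliberately wrong, so exact_match fails here
    }
    return answers.get(question, "")


if __name__ == "__main__":
    evaluators = {"exact_match": exact_match, "within_length": within_length}
    print(run_experiment(toy_app, DATASET, evaluators))
    print(pairwise_preference("Paris", "Lyon", DATASET[0]))
```

The sketch separates the dataset, the application under test, and the evaluators, so the same evaluators can score a different application version in a second experiment, and the pairwise function can then compare the two versions' outputs example by example.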

API entry from apis.yml

name: LangSmith Evaluation
description: LangSmith Evaluation is LangChain's evaluation framework for measuring application quality
  across the lifecycle. The docs describe evals as "a way to breakdown what 'good' looks like and measure
  it." It supports code evaluators (deterministic rules), LLM-as-judge evaluators (reference-based or
  reference-free), and heuristic checks (length, latency, keywords). Concepts include datasets and examples,
  experiments, and pairwise evaluation for relative comparisons.
humanURL: https://docs.langchain.com/langsmith/evaluation-concepts
baseURL: https://api.smith.langchain.com
tags:
- LangChain
- LLM as a Judge
- Pairwise
- Reference-Free
- Online and Offline
properties:
- type: Documentation
  url: https://docs.langchain.com/langsmith/evaluation-concepts
- type: Portal
  url: https://smith.langchain.com
- type: Pricing
  url: https://www.langchain.com/pricing-langsmith