Evals

Inspect AI

Inspect AI is an open-source framework for evaluating large language models, developed and maintained by the UK AI Security Institute (UK AISI) and Meridian Labs. Its scoring options include text comparisons, model-based grading such as model_graded_fact(), and custom scorers. Datasets carry input and target columns, and inputs can include images, audio, and video. The framework targets frontier-AI capability and safety assessment across coding, reasoning, knowledge, behavior, and multimodal understanding.
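To make the input/target and text-comparison ideas concrete, here is a minimal standard-library sketch. It does not use Inspect AI's API: `Record`, `exact_match`, `contains_target`, `accuracy`, and `toy_model` are all hypothetical names chosen for illustration, and the canned answers stand in for a real model. In an actual evaluation, the framework's own scorers (for example model_graded_fact()) would take the place of the comparison functions.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for one dataset row with input and target columns."""
    input: str
    target: str


def exact_match(output: str, target: str) -> bool:
    """Strict text comparison: equal after trimming whitespace, ignoring case."""
    return output.strip().lower() == target.strip().lower()


def contains_target(output: str, target: str) -> bool:
    """Looser text comparison: the target appears anywhere in the output."""
    return target.strip().lower() in output.lower()


def accuracy(
    records: Iterable[Record],
    answer: Callable[[str], str],
    scorer: Callable[[str, str], bool],
) -> float:
    """Fraction of records whose answer the scorer accepts (0.0 if empty)."""
    rows = list(records)
    if not rows:
        return 0.0
    correct = sum(1 for r in rows if scorer(answer(r.input), r.target))
    return correct / len(rows)


def toy_model(prompt: str) -> str:
    """Canned answers standing in for a real model call (assumption)."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "What is 2 + 2?": "4",
        "Who wrote Hamlet?": "Christopher Marlowe",
    }
    return canned.get(prompt, "I don't know.")


DATASET = [
    Record("What is the capital of France?", "Paris"),
    Record("What is 2 + 2?", "4"),
    Record("Who wrote Hamlet?", "Shakespeare"),
]

if __name__ == "__main__":
    print("exact match:", accuracy(DATASET, toy_model, exact_match))
    print("contains target:", accuracy(DATASET, toy_model, contains_target))
```

The two scorers disagree on the first record, where the answer is a full sentence containing the target. That gap is one reason a catalog of scorers, including model-based grading, is useful: the strictness of the comparison is a design choice that changes the reported score.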


Documentation

Documentation: https://inspect.aisi.org.uk/

SDKs

GitHub repository: https://github.com/UKGovernmentBEIS/inspect_ai

API entry from apis.yml

name: Inspect AI
description: Inspect AI is an open-source framework for large language model evaluations developed and
  maintained by the UK AI Security Institute (UK AISI) and Meridian Labs. It supports text comparisons,
  model-based grading such as model_graded_fact(), and custom scorers. Datasets carry input and target
  columns, with multimodal support across image, audio, and video. The framework targets frontier-AI capability
  and safety assessment across coding, reasoning, knowledge, behavior, and multimodal understanding.
humanURL: https://inspect.aisi.org.uk/
baseURL: https://inspect.aisi.org.uk
tags:
- UK AISI
- Open Source
- Frontier AI
- Model Graded
- Safety Evaluation
properties:
- type: Documentation
  url: https://inspect.aisi.org.uk/
- type: GitHubRepository
  url: https://github.com/UKGovernmentBEIS/inspect_ai