Inspect AI
Inspect AI is an open-source framework for evaluating large language models, developed and maintained by the UK AI Security Institute (UK AISI) and Meridian Labs. For scoring, it supports text comparisons, model-based grading such as model_graded_fact(), and custom scorers. Datasets carry input and target columns, with multimodal support for image, audio, and video. The framework targets frontier-AI capability and safety assessment across coding, reasoning, knowledge, behavior, and multimodal understanding.
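To make the idea of scoring model output against a target column concrete, here is a minimal standard-library sketch of text-comparison scoring. It does not use or reproduce Inspect AI's API: the names EvalSample, exact_match, includes_target, and accuracy are hypothetical, and the outputs list is a hard-coded stand-in for real model responses.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class EvalSample:
    """Hypothetical record mirroring the input/target columns of a dataset."""
    input: str   # prompt shown to the model
    target: str  # reference answer used for scoring


# A scorer takes (model_output, target) and reports whether the output passes.
Scorer = Callable[[str, str], bool]


def exact_match(output: str, target: str) -> bool:
    """Pass only if output equals target, ignoring case and outer whitespace."""
    return output.strip().lower() == target.strip().lower()


def includes_target(output: str, target: str) -> bool:
    """Pass if the target appears anywhere in the output, ignoring case."""
    return target.strip().lower() in output.lower()


def accuracy(samples: Sequence[EvalSample],
             outputs: Sequence[str],
             scorer: Scorer) -> float:
    """Fraction of samples whose output passes the given scorer."""
    if len(samples) != len(outputs):
        raise ValueError("samples and outputs must have the same length")
    if not samples:
        return 0.0
    correct = sum(scorer(out, s.target) for s, out in zip(samples, outputs))
    return correct / len(samples)


samples = [
    EvalSample(input="What is the capital of France?", target="Paris"),
    EvalSample(input="What is 2 + 2?", target="4"),
]

# Stand-in model responses (assumed for illustration, not produced by a model).
outputs = ["The capital of France is Paris.", "5"]

print(accuracy(samples, outputs, includes_target))
print(accuracy(samples, outputs, exact_match))
```

Passing the scorer as a plain function is what lets a custom scorer slot in beside the built-in comparisons here. A model-graded scorer would have the same shape, but would call a grading model instead of comparing strings.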