Sandboxes

k8s Sandbox (UK AISI)
Python package that provides a Kubernetes sandbox environment for Inspect.
EC2 Sandbox (UK AISI)
Python package that provides an EC2 virtual machine sandbox environment for Inspect.
Modal Sandbox (Meridian)
Serverless container sandbox for Inspect using Modal’s cloud infrastructure.
Proxmox Sandbox (UK AISI)
Use virtual machines running within a Proxmox instance as Inspect sandboxes.
Inspect Policy Sandbox (Arnab Mitra)
Sandbox wrapper that allows fine-grained control over command execution and file I/O.

Analysis

Inspect Scout (Meridian)
Transcript analysis for Inspect evaluations.
Inspect Viz (Meridian)
Interactive data visualization for Inspect evaluations.
Docent (Transluce)
Tools to summarize, cluster, and search over agent transcripts.
Lunette (Fulcrum Research)
Platform for understanding and improving agents.
Inspect WandB (Arcadia)
Integration with the Weights & Biases platform.

Frameworks

Inspect SWE (Meridian)
Software engineering agents (Claude Code and Codex CLI) for Inspect.
Inspect Cyber (UK AISI)
Python package that streamlines the process of creating agentic cyber evaluations in Inspect.
Petri (Anthropic)
Framework for testing alignment hypotheses end-to-end, including automatic scenario generation.
Control Arena (UK AISI)
Framework for running experiments on AI Control and Monitoring.

Tooling

Inspect Flow (Meridian)
Workflow orchestration for reproducibly running evals at scale.
Evaljobs (Hugging Face)
Run evals on Hugging Face GPUs and share results and code on the Hugging Face Hub.
Inspect VS Code (Meridian)
VS Code extension that assists with developing and debugging Inspect evaluations.

Evals

Inspect Evals (UK AISI)
Over 1000 LLM evaluations covering safety, coding, reasoning, knowledge, and agent capabilities.
OpenBench (Groq)
Standardized, reproducible benchmarking for LLMs across 30+ evals.
Inspect Harbor (Meridian)
Evals from the Harbor framework, including terminal-bench, replicationbench, and compilebench.