Agent Bridge

Note

The agent_bridge() and sandbox_agent_bridge() functions described below are available only in the development version of Inspect. To install the development version from GitHub:

pip install git+https://github.com/UKGovernmentBEIS/inspect_ai

Note that a previous (and now deprecated) variation of the agent bridge is available in all versions of Inspect via the bridge() function.

Overview

While Inspect provides facilities for native agent development, you can also easily integrate agents created with third-party frameworks like LangChain, or use fully custom agents you have developed or ported from a research paper. You can also use CLI-based agents that run within sandboxes (e.g. Codex CLI).

Agents are bridged into Inspect such that their native model calling functions are routed through the current Inspect model provider. There are two types of agent bridges supported:

  1. Bridging to Python-based agents that run in the same process as Inspect via the agent_bridge() context manager.

  2. Bridging to agents that run in a sandbox via the sandbox_agent_bridge() context manager (these agents can be written in any language).

We’ll cover each of these configurations in turn below.

Agent Bridge

To bridge a Python based agent running in the same process as Inspect:

  1. Write your custom Python agent as normal using the OpenAI connector provided by your agent system, specifying “inspect” as the model name. Note that both the Completions API and Responses API are supported.

  2. Run your custom Python agent within the agent_bridge() context manager which redirects OpenAI calls to the current Inspect model provider.

For example, here we build an agent that uses the OpenAI SDK directly (imagine using your favourite agent framework in its place):

from openai import AsyncOpenAI
from inspect_ai.agent import (
    Agent, AgentState, agent, agent_bridge
)
from inspect_ai.model import messages_to_openai

@agent
def my_agent() -> Agent:
    async def execute(state: AgentState) -> AgentState:
        async with agent_bridge(state) as bridge:
            client = AsyncOpenAI()
            
            await client.chat.completions.create(
                model="inspect",
                messages=messages_to_openai(state.messages),
            )

            return bridge.state

    return execute
Key aspects of this example:

  1. Use the agent_bridge() context manager to redirect the OpenAI API to the Inspect model provider. Pass the state so that the bridge can automatically keep track of changes to messages and output based on model calls passing through the bridge.

  2. Use the OpenAI API with model="inspect", which enables Inspect to intercept the request and send it to the Inspect model being evaluated for the task.

  3. Convert the state.messages input into native OpenAI messages using the messages_to_openai() function.
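
Agents defined this way can be used anywhere a conventional Inspect agent can. For example, here is a minimal sketch of running my_agent() as the solver for a task (the task name and sample are illustrative):

from inspect_ai import Task, task
from inspect_ai.dataset import Sample

@task
def my_task() -> Task:
    return Task(
        # single illustrative sample; substitute your own dataset
        dataset=[Sample(input="What is the capital of France?")],
        solver=my_agent(),
    )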

The following examples further demonstrate how to integrate other agent frameworks with Inspect:

LangChain: Demonstrates using a native LangChain agent with Inspect to perform Q/A using the Tavily Search API.

Codex CLI: Demonstrates using the Codex CLI agent with Inspect to explore a Kali Linux system.

Sandbox Bridge

To bridge an agent running within a sandbox into Inspect:

  1. Configure your sandbox (e.g. via its Dockerfile) to contain the agent that you want to run. The agent should be configured to talk to the OpenAI API on localhost port 13131 (e.g. OPENAI_BASE_URL=http://localhost:13131/v1). Note that both the Completions API and Responses API are supported.

  2. Write a standard Inspect agent that uses the sandbox_agent_bridge() context manager and the sandbox().exec() method to invoke the custom agent.

The sandbox bridge works by running a proxy server inside the sandbox container which receives OpenAI API requests. This proxy server in turn relays requests to the current Inspect model provider.

For example, here we build an agent that runs a custom agent binary (passing it input on the command line and reading output from stdout):

from inspect_ai.agent import (
    Agent, AgentState, agent, sandbox_agent_bridge
)
from inspect_ai.model import user_prompt
from inspect_ai.util import sandbox

@agent
def my_agent() -> Agent:
    async def execute(state: AgentState) -> AgentState:
        async with sandbox_agent_bridge(state) as bridge:
            
            prompt = user_prompt(state.messages)
            
            result = await sandbox().exec(
                cmd=[
                    "/opt/my_agent",
                    "--prompt",
                    prompt.text
                ],
                env={"OPENAI_BASE_URL": f"http://localhost:{bridge.port}/v1"}
            )
            if not result.success:
                raise RuntimeError(f"Agent error: {result.stderr}")

            return bridge.state

    return execute
Key aspects of this example:

  1. Use the sandbox_agent_bridge() context manager to redirect the OpenAI API to the Inspect model provider. Pass the state so that the bridge can automatically keep track of changes to messages and output based on model calls passing through the bridge.

  2. Extract the last user message from the message history with user_prompt().

  3. Run the agent, using a CLI argument for input and stdout for output (other agents may use more sophisticated encoding schemes for messages in and out).

  4. Redirect the OpenAI API to talk to a proxy server that communicates back to the current Inspect model provider. Note that we read the port to listen on from the bridge yielded by the context manager.

The Codex CLI example provides a more in-depth demonstration of running custom agents in sandboxes.

Models

As demonstrated above, communication with Inspect models is done by using the OpenAI API with model="inspect". You can use the same technique to interface with other Inspect models. To do this, prefix the fully qualified model name with "inspect/".

For example, in a LangChain agent, you would do this to utilise the Inspect interface to Gemini:

from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="inspect/google/gemini-1.5-pro")
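
The same prefix works when calling the OpenAI SDK directly from within an agent_bridge(). For example, a minimal sketch (the model name and prompt are illustrative):

from openai import AsyncOpenAI

client = AsyncOpenAI()

completion = await client.chat.completions.create(
    model="inspect/anthropic/claude-3-5-haiku-latest",  # illustrative model name
    messages=[{"role": "user", "content": "Say hello."}],
)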

Transcript

Custom agents run through a bridge still get most of the benefit of the Inspect transcript and log viewer. All model calls are captured and produce the same transcript output as when using conventional agents.

If you want to use additional features of Inspect transcripts (e.g. spans, markdown output, etc.) you can still import and use the transcript() function as normal. For example:

from inspect_ai.log import transcript

transcript().info("custom *markdown* content")
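
Recent versions of Inspect also support grouping transcript events into spans. A minimal sketch, assuming the span() context manager exported from inspect_ai.util (the span name is illustrative):

from inspect_ai.log import transcript
from inspect_ai.util import span

async def planning_phase() -> None:
    # events recorded inside the context manager are grouped
    # under a "planning" span in the transcript (assumed API)
    async with span("planning"):
        transcript().info("starting planning phase")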