inspect_ai.tool
Tools
bash
Bash shell command execution tool.
Execute bash shell commands using a sandbox environment (e.g. “docker”).
@tool(viewer=code_viewer("bash", "cmd"))
def bash(
    timeout: int | None = None, user: str | None = None, sandbox: str | None = None
) -> Tool

timeout: int | None
Timeout (in seconds) for command.
user: str | None
User to execute commands as.
sandbox: str | None
Optional sandbox environment name.
python
Python code execution tool.
Execute Python code using a sandbox environment (e.g. “docker”).
@tool(viewer=code_viewer("python", "code"))
def python(
    timeout: int | None = None, user: str | None = None, sandbox: str | None = None
) -> Tool

timeout: int | None
Timeout (in seconds) for command.
user: str | None
User to execute commands as.
sandbox: str | None
Optional sandbox environment name.
bash_session
Interactive bash shell session tool.
Interact with a bash shell in a long running session using a sandbox environment (e.g. “docker”). This tool allows sending text to the shell, which could be a command followed by a newline character or any other input text such as the response to a password prompt.
To create a separate bash process for each call to bash_session(), pass a unique value for instance.
See complete documentation at https://inspect.aisi.org.uk/tools-standard.html#sec-bash-session.
@tool()
def bash_session(
    *,
    timeout: int | None = None,  # default is max_wait + 5 seconds
    wait_for_output: int | None = None,  # default is 30 seconds
    user: str | None = None,
    instance: str | None = None,
) -> Tool

timeout: int | None
Timeout (in seconds) for command.
wait_for_output: int | None
Maximum time (in seconds) to wait for output. If no output is received within this period, the function will return an empty string. The model may need to make multiple tool calls to obtain all output from a given command.
user: str | None
Username to run commands as.
instance: str | None
Instance id (each unique instance id has its own bash process).
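The wait_for_output contract (return an empty string when nothing arrives in time, so the model can poll again with another tool call) can be sketched in plain Python. This is an illustration of the polling behavior only; read_output and the queue are hypothetical stand-ins, not the inspect_ai implementation:

```python
import asyncio


async def read_output(queue: asyncio.Queue, wait_for_output: float) -> str:
    """Return the next chunk of output, or "" if none arrives in time."""
    try:
        return await asyncio.wait_for(queue.get(), timeout=wait_for_output)
    except asyncio.TimeoutError:
        # Mirrors bash_session: no output within the window means an empty
        # string is returned, so the caller can simply poll again.
        return ""


async def demo() -> tuple[str, str]:
    queue: asyncio.Queue = asyncio.Queue()
    first = await read_output(queue, wait_for_output=0.05)   # nothing queued yet
    await queue.put("hello\n")
    second = await read_output(queue, wait_for_output=0.05)  # output available
    return first, second


print(asyncio.run(demo()))  # ('', 'hello\n')
```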
text_editor
Custom editing tool for viewing, creating and editing files.
Perform text editor operations using a sandbox environment (e.g. “docker”).
IMPORTANT: This tool does not currently support Subtask isolation. This means that a change made to a file by one Subtask will be visible to another Subtask.
@tool()
def text_editor(timeout: int | None = None, user: str | None = None) -> Tool

timeout: int | None
Timeout (in seconds) for command. Defaults to 180 if not provided.
user: str | None
User to execute commands as.
web_browser
Tools used for web browser navigation.
To create a separate web browser process for each call to web_browser(), pass a unique value for instance.
See complete documentation at https://inspect.aisi.org.uk/tools-standard.html#sec-web-browser.
def web_browser(*, interactive: bool = True, instance: str | None = None) -> list[Tool]

interactive: bool
Provide interactive tools (enable clicking, typing, and submitting forms). Defaults to True.
instance: str | None
Instance id (each unique instance id has its own web browser process).
computer
Desktop computer tool.
See documentation at https://inspect.aisi.org.uk/tools-standard.html#sec-computer.
@tool
def computer(max_screenshots: int | None = 1, timeout: int | None = 180) -> Tool

max_screenshots: int | None
The maximum number of screenshots to play back to the model as input. Defaults to 1 (set to None to have no limit).
timeout: int | None
Timeout in seconds for computer tool actions. Defaults to 180 (set to None for no timeout).
web_search
Web search tool.
Web searches are executed using a provider. Providers are split into two categories:
Internal providers: “openai”, “anthropic”, “grok”, “gemini”, “perplexity”. These use the model’s built-in search capability and do not require separate API keys. These work only for their respective model provider (e.g. the “openai” search provider works only for openai/* models).
External providers: “tavily”, “google”, and “exa”. These are external services that work with any model and require separate accounts and API keys.
Internal providers will be prioritized if running on the corresponding model (e.g., “openai” provider will be used when running on openai models). If an internal provider is specified but the evaluation is run with a different model, a fallback external provider must also be specified.
See further documentation at https://inspect.aisi.org.uk/tools-standard.html#sec-web-search.
@tool
def web_search(
    providers: WebSearchProvider
    | WebSearchProviders
    | list[WebSearchProvider | WebSearchProviders]
    | None = None,
    **deprecated: Unpack[WebSearchDeprecatedArgs],
) -> Tool

providers: WebSearchProvider | WebSearchProviders | list[WebSearchProvider | WebSearchProviders] | None
Configuration for the search providers to use. Currently supported providers are “openai”, “anthropic”, “perplexity”, “tavily”, “gemini”, “grok”, “google”, and “exa”. The providers parameter supports several formats based on either a str specifying a provider or a dict whose keys are the provider names and whose values are the provider-specific options. A single value or a list of these can be passed. This arg is optional just for backwards compatibility; new code should always provide this argument.

Single provider:
web_search("tavily")
web_search({"tavily": {"max_results": 5}})  # Tavily-specific options

Multiple providers:
# "openai" used for OpenAI models, "tavily" as fallback
web_search(["openai", "tavily"])
# The True value means to use the provider with default options
web_search({"openai": True, "tavily": {"max_results": 5}})

Mixed format:
web_search(["openai", {"tavily": {"max_results": 5}}])

When specified in the dict format, the None value for a provider means to use the provider with default options.

Provider-specific options:
- openai: Supports OpenAI’s web search parameters. See https://platform.openai.com/docs/guides/tools-web-search?api-mode=responses
- anthropic: Supports Anthropic’s web search parameters. See https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-search-tool#tool-definition
- perplexity: Supports Perplexity’s web search parameters. See https://docs.perplexity.ai/api-reference/chat-completions-post
- tavily: Supports options like max_results, search_depth, etc. See https://docs.tavily.com/documentation/api-reference/endpoint/search
- exa: Supports options like text, model, etc. See https://docs.exa.ai/reference/answer
- google: Supports options like num_results, max_provider_calls, max_connections, and model
- grok: Supports X-AI’s live search parameters. See https://docs.x.ai/docs/guides/live-search#live-search

**deprecated: Unpack[WebSearchDeprecatedArgs]
Deprecated arguments.
think
Think tool for extra thinking.
Tool that provides models with the ability to include an additional thinking step as part of getting to its final answer.
Note that the think() tool is not a substitute for reasoning and extended thinking, but rather an alternate way of letting models express thinking that is better suited to some tool use scenarios. Please see the documentation on using the think tool before using it in your evaluations.
@tool
def think(
    description: str | None = None,
    thought_description: str | None = None,
) -> Tool

description: str | None
Override the default description of the think tool.
thought_description: str | None
Override the default description of the thought parameter.
MCP
mcp_connection
Context manager for running MCP servers required by tools.
Any ToolSource passed in tools will be examined to see if it references an MCPServer, and if so, that server will be connected to upon entering the context and disconnected from upon exiting the context.
@contextlib.asynccontextmanager
async def mcp_connection(
    tools: Sequence[Tool | ToolDef | ToolSource] | ToolSource,
) -> AsyncIterator[None]

tools: Sequence[Tool | ToolDef | ToolSource] | ToolSource
Tools in current context.
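The connect-on-enter, disconnect-on-exit behavior described above can be sketched with the stdlib contextlib.asynccontextmanager. FakeServer and connection are hypothetical stand-ins for MCPServer and mcp_connection(), showing the shape of the contract rather than the real implementation:

```python
import asyncio
import contextlib
from collections.abc import AsyncIterator

log: list[str] = []


class FakeServer:
    """Hypothetical stand-in for an MCPServer-like object."""

    def __init__(self, name: str) -> None:
        self.name = name

    async def connect(self) -> None:
        log.append(f"connect:{self.name}")

    async def disconnect(self) -> None:
        log.append(f"disconnect:{self.name}")


@contextlib.asynccontextmanager
async def connection(servers: list[FakeServer]) -> AsyncIterator[None]:
    # Connect on entry; disconnect on exit, even if the body raises.
    for server in servers:
        await server.connect()
    try:
        yield
    finally:
        for server in servers:
            await server.disconnect()


async def demo() -> None:
    async with connection([FakeServer("docs")]):
        log.append("tool calls happen here")


asyncio.run(demo())
print(log)  # ['connect:docs', 'tool calls happen here', 'disconnect:docs']
```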
mcp_server_stdio
MCP Server (Stdio).
Stdio interface to MCP server. Use this for MCP servers that run locally.
def mcp_server_stdio(
    *,
    name: str | None = None,
    command: str,
    args: list[str] | None = None,
    cwd: str | Path | None = None,
    env: dict[str, str] | None = None,
) -> MCPServer

name: str | None
Human readable name for the server (defaults to command if not specified).
command: str
The executable to run to start the server.
args: list[str] | None
Command line arguments to pass to the executable.
cwd: str | Path | None
The working directory to use when spawning the process.
env: dict[str, str] | None
The environment to use when spawning the process in addition to the platform specific set of default environment variables (e.g. “HOME”, “LOGNAME”, “PATH”, “SHELL”, “TERM”, and “USER” for Posix-based systems).
mcp_server_http
MCP Server (HTTP).
HTTP interface to MCP server. Use this for MCP servers available via a URL endpoint.
def mcp_server_http(
    *,
    name: str | None = None,
    url: str,
    execution: Literal["local", "remote"] = "local",
    authorization: str | None = None,
    headers: dict[str, str] | None = None,
    timeout: float = 5,
    sse_read_timeout: float = 60 * 5,
) -> MCPServer

name: str | None
Human readable name for the server (defaults to url if not specified).
url: str
URL to remote server.
execution: Literal['local', 'remote']
Where to execute tool calls (“local” for within the Inspect process, “remote” for execution by the model provider; note this is currently only supported by OpenAI and Anthropic).
authorization: str | None
OAuth Bearer token for authentication with the server.
headers: dict[str, str] | None
Headers to send to the server (typically authorization is included here).
timeout: float
Timeout (in seconds) for HTTP operations.
sse_read_timeout: float
How long (in seconds) the client will wait for a new event before disconnecting.
mcp_server_sandbox
MCP Server (Sandbox).
Interface to MCP server running in an Inspect sandbox.
def mcp_server_sandbox(
    *,
    name: str | None = None,
    command: str,
    args: list[str] | None = None,
    cwd: str | Path | None = None,
    env: dict[str, str] | None = None,
    sandbox: str | None = None,
    timeout: int | None = None,
) -> MCPServer

name: str | None
Human readable name for server (defaults to command with args if not specified).
command: str
The executable to run to start the server.
args: list[str] | None
Command line arguments to pass to the executable.
cwd: str | Path | None
The working directory to use when spawning the process.
env: dict[str, str] | None
The environment to use when spawning the process in addition to the platform specific set of default environment variables (e.g. “HOME”, “LOGNAME”, “PATH”, “SHELL”, “TERM”, and “USER” for Posix-based systems).
sandbox: str | None
The sandbox to use when spawning the process.
timeout: int | None
Timeout (in seconds) for command.
mcp_server_sse
MCP Server (SSE).
SSE interface to MCP server. Use this for MCP servers available via a URL endpoint.
NOTE: The SSE interface has been deprecated in favor of mcp_server_http() for MCP servers at URL endpoints.
def mcp_server_sse(
    *,
    name: str | None = None,
    url: str,
    execution: Literal["local", "remote"] = "local",
    authorization: str | None = None,
    headers: dict[str, str] | None = None,
    timeout: float = 5,
    sse_read_timeout: float = 60 * 5,
) -> MCPServer

name: str | None
Human readable name for the server (defaults to url if not specified).
url: str
URL to remote server.
execution: Literal['local', 'remote']
Where to execute tool calls (“local” for within the Inspect process, “remote” for execution by the model provider; note this is currently only supported by OpenAI and Anthropic).
authorization: str | None
OAuth Bearer token for authentication with the server.
headers: dict[str, str] | None
Headers to send to the server (typically authorization is included here).
timeout: float
Timeout (in seconds) for HTTP operations.
sse_read_timeout: float
How long (in seconds) the client will wait for a new event before disconnecting.
mcp_tools
Tools from MCP server.
def mcp_tools(
    server: MCPServer,
    *,
    tools: Literal["all"] | list[str] = "all",
) -> ToolSource

server: MCPServer
MCP server created with mcp_server_stdio(), mcp_server_http(), or mcp_server_sandbox().
tools: Literal['all'] | list[str]
List of tool names (or globs). Defaults to “all”, which returns all tools.
MCPServer
Model Context Protocol server interface.
MCPServer can be passed in the tools argument as a source of tools (use the mcp_tools() function to filter the list of tools).
class MCPServer(ToolSource, AbstractAsyncContextManager["MCPServer"])

Methods

tools
List of all tools provided by this server.
@abc.abstractmethod
async def tools(self) -> list[Tool]
MCPServerConfig
Configuration for MCP server.
class MCPServerConfig(BaseModel)

Attributes

type: Literal['stdio', 'http', 'sse']
Server type.
name: str
Human readable server name.
tools: Literal['all'] | list[str]
Tools to make available from server (“all” for all tools).
MCPServerConfigStdio
Configuration for MCP servers with stdio interface.
class MCPServerConfigStdio(MCPServerConfig)

Attributes

type: Literal['stdio']
Server type.
command: str
The executable to run to start the server.
args: list[str]
Command line arguments to pass to the executable.
cwd: str | Path | None
The working directory to use when spawning the process.
env: dict[str, str] | None
The environment to use when spawning the process in addition to the platform specific set of default environment variables (e.g. “HOME”, “LOGNAME”, “PATH”, “SHELL”, “TERM”, and “USER” for Posix-based systems).
MCPServerConfigHTTP
Configuration for MCP servers with HTTP interface.
class MCPServerConfigHTTP(MCPServerConfig)

Attributes

type: Literal['http', 'sse']
Server type.
url: str
URL for remote server.
headers: dict[str, str] | None
Headers for remote server (type “http” or “sse”).
Dynamic
tool_with
Tool with modifications to various attributes.
This function modifies the passed tool in place and returns it. If you want to create multiple variations of a single tool using tool_with() you should create the underlying tool multiple times.
def tool_with(
    tool: Tool,
    name: str | None = None,
    description: str | None = None,
    parameters: dict[str, str] | None = None,
    parallel: bool | None = None,
    viewer: ToolCallViewer | None = None,
    model_input: ToolCallModelInput | None = None,
) -> Tool

tool: Tool
Tool instance to modify.
name: str | None
Tool name (optional).
description: str | None
Tool description (optional).
parameters: dict[str, str] | None
Parameter descriptions (optional).
parallel: bool | None
Does the tool support parallel execution (defaults to True if not specified).
viewer: ToolCallViewer | None
Optional tool call viewer implementation.
model_input: ToolCallModelInput | None
Optional function that determines how tool call results are played back as model input.
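Because tool_with() modifies the passed tool in place, reusing one tool instance for several variants silently gives every variant the last modification, which is why the note above says to create the underlying tool multiple times. A minimal pure-Python sketch of the pitfall, where SimpleTool and with_name are hypothetical stand-ins for Tool and tool_with():

```python
from dataclasses import dataclass


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a Tool with editable metadata."""
    name: str
    description: str


def with_name(tool: SimpleTool, name: str) -> SimpleTool:
    # Like tool_with(): mutates the passed tool in place and returns it.
    tool.name = name
    return tool


# WRONG: both "variants" are the same mutated object.
shared = SimpleTool("add", "Add two numbers.")
a = with_name(shared, "add_ints")
b = with_name(shared, "add_floats")
print(a.name, b.name)  # add_floats add_floats

# RIGHT: create the underlying tool once per variant.
a = with_name(SimpleTool("add", "Add two numbers."), "add_ints")
b = with_name(SimpleTool("add", "Add two numbers."), "add_floats")
print(a.name, b.name)  # add_ints add_floats
```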
ToolDef
Tool definition.
class ToolDef

Attributes

tool: Callable[..., Any]
Callable to execute tool.
name: str
Tool name.
description: str
Tool description.
parameters: ToolParams
Tool parameter descriptions.
parallel: bool
Supports parallel execution.
viewer: ToolCallViewer | None
Custom viewer for tool call.
model_input: ToolCallModelInput | None
Custom model input presenter for tool calls.
options: dict[str, object] | None
Optional property bag that can be used by the model provider to customize the implementation of the tool
Methods

__init__
Create a tool definition.
def __init__(
    self,
    tool: Callable[..., Any],
    name: str | None = None,
    description: str | None = None,
    parameters: dict[str, str] | ToolParams | None = None,
    parallel: bool | None = None,
    viewer: ToolCallViewer | None = None,
    model_input: ToolCallModelInput | None = None,
    options: dict[str, object] | None = None,
) -> None

tool: Callable[..., Any]
Callable to execute tool.
name: str | None
Name of tool. Discovered automatically if not specified.
description: str | None
Description of tool. Discovered automatically by parsing doc comments if not specified.
parameters: dict[str, str] | ToolParams | None
Tool parameter descriptions and types. Discovered automatically by parsing doc comments if not specified.
parallel: bool | None
Does the tool support parallel execution (defaults to True if not specified).
viewer: ToolCallViewer | None
Optional tool call viewer implementation.
model_input: ToolCallModelInput | None
Optional function that determines how tool call results are played back as model input.
options: dict[str, object] | None
Optional property bag that can be used by the model provider to customize the implementation of the tool
as_tool
Convert a ToolDef to a Tool.
def as_tool(self) -> Tool
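The "discovered automatically" behavior above can be roughly sketched with the stdlib inspect module: pull the name from the callable, the description from its docstring, and the parameter names and types from its signature. describe is a hypothetical, much-simplified stand-in for what ToolDef does:

```python
import inspect


def describe(func) -> dict:
    """Sketch of ToolDef-style discovery from the callable itself."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        # first line of the docstring as the short description
        "description": (inspect.getdoc(func) or "").split("\n")[0],
        # parameter name -> annotated type name
        "parameters": {
            p.name: getattr(p.annotation, "__name__", str(p.annotation))
            for p in signature.parameters.values()
        },
    }


async def add(x: int, y: int) -> int:
    """Add two numbers."""
    return x + y


print(describe(add))
# {'name': 'add', 'description': 'Add two numbers.', 'parameters': {'x': 'int', 'y': 'int'}}
```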
Types
Tool
Additional tool that an agent can use to solve a task.
class Tool(Protocol):
    async def __call__(
        self,
        *args: Any,
        **kwargs: Any,
    ) -> ToolResult

*args: Any
Arguments for the tool.
**kwargs: Any
Keyword arguments for the tool.
Examples
@tool
def add() -> Tool:
    async def execute(x: int, y: int) -> int:
        return x + y

    return execute

ToolResult
Valid types for results from tool calls.
ToolResult = (
    str
    | int
    | float
    | bool
    | ContentText
    | ContentImage
    | ContentAudio
    | ContentVideo
    | list[ContentText | ContentImage | ContentAudio | ContentVideo]
)

ToolError
Exception thrown from tool call.
If you throw a ToolError from within a tool call, the error will be reported to the model for further processing (rather than ending the sample). If you want to raise a fatal error from a tool call use an appropriate standard exception type (e.g. RuntimeError, ValueError, etc.)
class ToolError(Exception)

Methods

__init__
Create a ToolError.
def __init__(self, message: str) -> None

message: str
Error message to report to the model.
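A sketch of the two error paths described above, using a local ToolError stand-in: a ToolError is caught by the agent loop and surfaced to the model as a tool result, while any other exception (RuntimeError, ValueError, ...) would propagate and end the sample:

```python
import asyncio


class ToolError(Exception):
    """Stand-in for inspect_ai's ToolError."""


async def divide(x: float, y: float) -> float:
    if y == 0:
        # Recoverable: reported back to the model for further processing
        raise ToolError("y must be non-zero")
    return x / y


async def call_tool(x: float, y: float) -> str:
    try:
        return str(await divide(x, y))
    except ToolError as ex:
        # The agent loop turns this into an error message the model can see;
        # any other exception type would propagate and end the sample.
        return f"Error: {ex}"


print(asyncio.run(call_tool(1, 2)))  # 0.5
print(asyncio.run(call_tool(1, 0)))  # Error: y must be non-zero
```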
ToolCallError
Error raised by a tool call.
@dataclass
class ToolCallError

Attributes

type: Literal['parsing', 'timeout', 'unicode_decode', 'permission', 'file_not_found', 'is_a_directory', 'limit', 'approval', 'unknown', 'output_limit']
Error type.
message: str
Error message.
ToolChoice
Specify which tool to call.
“auto” means the model decides; “any” means use at least one tool; “none” means never call a tool; ToolFunction instructs the model to call a specific function.
ToolChoice = Union[Literal["auto", "any", "none"], ToolFunction]

ToolFunction
Indicate that a specific tool function should be called.
@dataclass
class ToolFunction

Attributes

name: str
The name of the tool function to call.
ToolInfo
Specification of a tool (JSON Schema compatible)
If you are implementing a ModelAPI, most LLM libraries can be passed this object (dumped to a dict) directly as a function specification. For example, in the OpenAI provider:
ChatCompletionToolParam(
    type="function",
    function=tool.model_dump(exclude_none=True),
)

In some cases the field names don’t match up exactly. In that case call model_dump() on the parameters field. For example, in the Anthropic provider:
ToolParam(
    name=tool.name,
    description=tool.description,
    input_schema=tool.parameters.model_dump(exclude_none=True),
)

class ToolInfo(BaseModel)

Attributes

name: str
Name of tool.
description: str
Short description of tool.
parameters: ToolParams
JSON Schema of tool parameters object.
options: dict[str, Any] | None
Optional property bag that can be used by the model provider to customize the implementation of the tool
ToolParams
Description of tool parameters object in JSON Schema format.
class ToolParams(BaseModel)

Attributes

type: Literal['object']
Params type (always ‘object’).
properties: dict[str, ToolParam]
Tool function parameters.
required: list[str]
List of required fields.
additionalProperties: bool
Are additional object properties allowed? (always False)
ToolParam
Description of tool parameter in JSON Schema format.
ToolParam: TypeAlias = JSONSchema

ToolSource
Protocol for dynamically providing a set of tools.
@runtime_checkable
class ToolSource(Protocol)

Methods

tools
Retrieve tools from tool source.
async def tools(self) -> list[Tool]
WebSearchProviders
Provider configuration for web_search() tool.
The web_search() tool provides models the ability to enhance their context window by performing a search. Web searches are executed using a provider. Providers are split into two categories:
Internal providers: "openai", "anthropic", "gemini", "grok", and "perplexity". These use the model’s built-in search capability and do not require separate API keys. These work only for their respective model provider (e.g. the “openai” search provider works only for openai/* models).
External providers: "tavily", "exa", and "google". These are external services that work with any model and require separate accounts and API keys. Note that “google” is different from “gemini”: “google” refers to Google’s Programmable Search Engine service, while “gemini” refers to Google’s built-in search capability for Gemini models.
Internal providers will be prioritized if running on the corresponding model (e.g., “openai” provider will be used when running on openai models). If an internal provider is specified but the evaluation is run with a different model, a fallback external provider must also be specified.
class WebSearchProviders(TypedDict, total=False)

Attributes

openai: dict[str, Any] | Literal[True]
Use OpenAI internal provider. For available options see https://platform.openai.com/docs/guides/tools-web-search?api-mode=responses.
anthropic: dict[str, Any] | Literal[True]
Use Anthropic internal provider. For available options see https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-search-tool.
grok: dict[str, Any] | Literal[True]
Use Grok internal provider. For available options see https://docs.x.ai/docs/guides/live-search.
gemini: dict[str, Any] | Literal[True]
Use Gemini internal provider. For available options see https://ai.google.dev/gemini-api/docs/google-search.
perplexity: dict[str, Any] | Literal[True]
Use Perplexity internal provider. For available options see https://docs.perplexity.ai/api-reference/chat-completions-post.
tavily: dict[str, Any] | Literal[True]
Use Tavily external provider. For available options see https://inspect.aisi.org.uk/tools-standard.html#tavili-options.
google: dict[str, Any] | Literal[True]
Use Google external provider. For available options see https://inspect.aisi.org.uk/tools-standard.html#google-options.
exa: dict[str, Any] | Literal[True]
Use Exa external provider. For available options see https://inspect.aisi.org.uk/tools-standard.html#exa-options.
Decorator
tool
Decorator for registering tools.
def tool(
    func: Callable[P, Tool] | None = None,
    *,
    name: str | None = None,
    viewer: ToolCallViewer | None = None,
    model_input: ToolCallModelInput | None = None,
    parallel: bool = True,
    prompt: str | None = None,
) -> Callable[P, Tool] | Callable[[Callable[P, Tool]], Callable[P, Tool]]

func: Callable[P, Tool] | None
Tool function.
name: str | None
Optional name for tool. If the decorator has no name argument then the name of the tool creation function will be used as the name of the tool.
viewer: ToolCallViewer | None
Provide a custom view of tool call and context.
model_input: ToolCallModelInput | None
Provide a custom function for playing back tool results as model input.
parallel: bool
Does this tool support parallel execution? (defaults to True).
prompt: str | None
Deprecated (provide all descriptive information about the tool within the tool function’s doc comment).
Examples
@tool
def add() -> Tool:
    async def execute(x: int, y: int) -> int:
        return x + y

    return execute