inspect_ai.tool
Tools
bash
Bash shell command execution tool.
Execute bash shell commands using a sandbox environment (e.g. “docker”).
@tool(viewer=code_viewer("bash", "cmd"))
def bash(
    timeout: int | None = None, user: str | None = None, sandbox: str | None = None
) -> Tool
timeout
int | None
Timeout (in seconds) for command.
user
str | None
User to execute commands as.
sandbox
str | None
Optional sandbox environment name.
python
Python code execution tool.
Execute Python code using a sandbox environment (e.g. “docker”).
@tool(viewer=code_viewer("python", "code"))
def python(
    timeout: int | None = None, user: str | None = None, sandbox: str | None = None
) -> Tool
timeout
int | None
Timeout (in seconds) for command.
user
str | None
User to execute commands as.
sandbox
str | None
Optional sandbox environment name.
bash_session
Bash shell session command execution tool.
Execute bash shell commands in a long running session using a sandbox environment (e.g. “docker”).
By default, a separate bash process is created within the sandbox for each call to bash_session(). You can modify this behavior by passing instance=None (which will result in a single bash process for the entire sample) or by using other instance values that implement another scheme.
See complete documentation at https://inspect.aisi.org.uk/tools-standard.html#sec-bash-session.
@tool(viewer=code_viewer("bash", "command"))
def bash_session(*, timeout: int | None = None, instance: str | None = uuid()) -> Tool
timeout
int | None
Timeout (in seconds) for command.
instance
str | None
Instance id (each unique instance id has its own bash process).
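The instance semantics described above can be pictured with a small stand-in registry. This is a toy model of the documented behavior, not the inspect_ai implementation: `SessionRegistry` and `FakeSession` are hypothetical names, and no real bash process is started.

```python
from __future__ import annotations

import uuid


class FakeSession:
    """Hypothetical stand-in for a long-running bash process."""

    def __init__(self, instance_id: str | None) -> None:
        self.instance_id = instance_id
        self.history: list[str] = []

    def run(self, command: str) -> int:
        # Record the command; a real session would execute it in the sandbox.
        self.history.append(command)
        return len(self.history)


class SessionRegistry:
    """Maps each instance id to exactly one session (None is one shared key)."""

    def __init__(self) -> None:
        self._sessions: dict[str | None, FakeSession] = {}

    def get(self, instance: str | None) -> FakeSession:
        if instance not in self._sessions:
            self._sessions[instance] = FakeSession(instance)
        return self._sessions[instance]


registry = SessionRegistry()

# A fresh id each time: every lookup gets its own session.
a = registry.get(uuid.uuid4().hex)
b = registry.get(uuid.uuid4().hex)

# instance=None: every lookup shares the single per-sample session.
shared_1 = registry.get(None)
shared_2 = registry.get(None)
shared_1.run("cd /tmp")
```

Because `shared_1` and `shared_2` are the same object, state such as the working directory would carry over between calls, which is the practical difference between the two modes.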
text_editor
Custom editing tool for viewing, creating and editing files.
Perform text editor operations using a sandbox environment (e.g. “docker”).
IMPORTANT: This tool does not currently support Subtask isolation. This means that a change made to a file by one Subtask will be visible to another Subtask.
@tool()
def text_editor(timeout: int | None = None, user: str | None = None) -> Tool
timeout
int | None
Timeout (in seconds) for command.
user
str | None
User to execute commands as.
web_browser
Tools used for web browser navigation.
By default, a separate web browser process is created within the sandbox for each call to web_browser(). You can modify this behavior by passing instance=None (which will result in a single web browser for the entire sample) or by using other instance values that implement another scheme.
See complete documentation at https://inspect.aisi.org.uk/tools-standard.html#sec-web-browser.
def web_browser(
    *, interactive: bool = True, instance: str | None = uuid()
) -> list[Tool]
interactive
bool
Provide interactive tools (enable clicking, typing, and submitting forms). Defaults to True.
instance
str | None
Instance id (each unique instance id has its own web browser process).
computer
Desktop computer tool.
See documentation at https://inspect.aisi.org.uk/tools-standard.html#sec-computer.
@tool
def computer(max_screenshots: int | None = 1, timeout: int | None = 180) -> Tool
max_screenshots
int | None
The maximum number of screenshots to play back to the model as input. Defaults to 1 (set to None to have no limit).
timeout
int | None
Timeout in seconds for computer tool actions. Defaults to 180 (set to None for no timeout).
web_search
Web search tool.
A tool that can be registered for use by models to search the web. Use the use_tools() solver to make the tool available (e.g. use_tools(web_search())).
A web search is conducted using the specified provider, the results are parsed for relevance using the specified model, and the top ‘num_results’ relevant pages are returned.
See further documentation at https://inspect.aisi.org.uk/tools-standard.html#sec-web-search.
@tool
def web_search(
    provider: Literal["google"] = "google",
    num_results: int = 3,
    max_provider_calls: int = 3,
    max_connections: int = 10,
    model: str | None = None,
) -> Tool
provider
Literal['google']
Search provider (defaults to “google”, currently the only provider). Possible future providers include “brave” and “bing”.
num_results
int
Number of web search result pages to return to the model.
max_provider_calls
int
Maximum number of search calls to make to the search provider.
max_connections
int
Maximum number of concurrent connections to API endpoint of search provider.
model
str | None
Model used to parse web pages for relevance.
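The interplay of num_results and max_provider_calls can be sketched with stand-ins. This is an illustration under stated assumptions, not the library's implementation: `search_pipeline`, `fake_provider`, and `fake_relevance` are hypothetical, the provider is assumed to return one batch of candidate pages per call, and relevance is decided by a plain function where the real tool uses a model.

```python
from __future__ import annotations

from typing import Callable


def search_pipeline(
    query: str,
    provider: Callable[[str, int], list[str]],
    is_relevant: Callable[[str, str], bool],
    num_results: int = 3,
    max_provider_calls: int = 3,
) -> list[str]:
    """Collect up to num_results relevant pages in at most max_provider_calls calls."""
    relevant: list[str] = []
    for call_index in range(max_provider_calls):
        # Each provider call returns one batch of candidate pages (assumption).
        for page in provider(query, call_index):
            if is_relevant(query, page) and page not in relevant:
                relevant.append(page)
            if len(relevant) == num_results:
                return relevant
    return relevant


calls: list[int] = []


def fake_provider(query: str, call_index: int) -> list[str]:
    """Hypothetical provider: one on-topic and one off-topic page per call."""
    calls.append(call_index)
    return [f"{query} page {call_index * 2}", f"off-topic page {call_index * 2 + 1}"]


def fake_relevance(query: str, page: str) -> bool:
    """Hypothetical relevance check; the real tool asks a model instead."""
    return page.startswith(query)


results = search_pipeline("inspect", fake_provider, fake_relevance, num_results=2)
```

The loop stops as soon as enough relevant pages are found, so the provider call limit only matters when relevant pages are sparse.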
think
Think tool for extra thinking.
Tool that provides models with the ability to include an additional thinking step as part of getting to their final answer.
Note that the think() tool is not a substitute for reasoning and extended thinking, but rather an alternate way of letting models express thinking that is better suited to some tool use scenarios. Please see the documentation on using the think tool before using it in your evaluations.
@tool
def think(
    description: str | None = None,
    thought_description: str | None = None,
) -> Tool
description
str | None
Override the default description of the think tool.
thought_description
str | None
Override the default description of the thought parameter.
MCP
mcp_connection
Context manager for running MCP servers required by tools.
Any ToolSource passed in tools will be examined to see if it references an MCPServer, and if so, that server will be connected to upon entering the context and disconnected from upon exiting the context.
@contextlib.asynccontextmanager
async def mcp_connection(
    tools: Sequence[Tool | ToolDef | ToolSource] | ToolSource,
) -> AsyncIterator[None]
tools
Sequence[Tool | ToolDef | ToolSource] | ToolSource
Tools in current context.
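The connect-on-enter, disconnect-on-exit contract can be illustrated with the standard library alone. This is a minimal sketch of the pattern, not the inspect_ai code: `FakeServer` and `fake_connection` are hypothetical names, and the servers are passed in directly where the real function discovers them from tool sources.

```python
from __future__ import annotations

import asyncio
import contextlib
from typing import AsyncIterator


class FakeServer:
    """Hypothetical stand-in for an MCP server; records connection state."""

    def __init__(self) -> None:
        self.connected = False
        self.events: list[str] = []

    async def connect(self) -> None:
        self.connected = True
        self.events.append("connect")

    async def disconnect(self) -> None:
        self.connected = False
        self.events.append("disconnect")


@contextlib.asynccontextmanager
async def fake_connection(servers: list[FakeServer]) -> AsyncIterator[None]:
    """Connect every server on entry and disconnect on exit, even on error."""
    async with contextlib.AsyncExitStack() as stack:
        for server in servers:
            await server.connect()
            # Register the disconnect so it runs when the context exits.
            stack.push_async_callback(server.disconnect)
        yield


async def main() -> list[str]:
    server = FakeServer()
    async with fake_connection([server]):
        server.events.append(f"in-context connected={server.connected}")
    return server.events


events = asyncio.run(main())
```

Using an exit stack means a failure while connecting a later server still disconnects the ones already connected.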
mcp_server_stdio
MCP Server (Stdio).
Stdio interface to MCP server. Use this for MCP servers that run locally.
def mcp_server_stdio(
    *,
    command: str,
    args: list[str] = [],
    cwd: str | Path | None = None,
    env: dict[str, str] | None = None,
) -> MCPServer
command
str
The executable to run to start the server.
args
list[str]
Command line arguments to pass to the executable.
cwd
str | Path | None
The working directory to use when spawning the process.
env
dict[str, str] | None
The environment to use when spawning the process in addition to the platform specific set of default environment variables (e.g. “HOME”, “LOGNAME”, “PATH”, “SHELL”, “TERM”, and “USER” for Posix-based systems).
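The way env combines with the platform defaults can be sketched as follows. This is a rough model for the POSIX case only, with two assumptions that the description does not state: caller-supplied values win over inherited ones, and defaults missing from the current environment are simply skipped. `build_env` is a hypothetical helper.

```python
from __future__ import annotations

import os

# Default variables named in the description above (POSIX case).
POSIX_DEFAULT_VARS = ("HOME", "LOGNAME", "PATH", "SHELL", "TERM", "USER")


def build_env(extra: dict[str, str] | None = None) -> dict[str, str]:
    """Start from the inherited defaults, then layer the caller's env on top."""
    env = {name: os.environ[name] for name in POSIX_DEFAULT_VARS if name in os.environ}
    # Assumption: explicit entries override inherited defaults.
    env.update(extra or {})
    return env


env = build_env({"API_TOKEN": "example-token", "TERM": "dumb"})
```

Anything else in the parent environment (for example credentials exported in the shell) is absent from `env` unless it is passed explicitly.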
mcp_server_sse
MCP Server (SSE).
SSE interface to MCP server. Use this for MCP servers available via a URL endpoint.
def mcp_server_sse(
    *,
    url: str,
    headers: dict[str, Any] | None = None,
    timeout: float = 5,
    sse_read_timeout: float = 60 * 5,
) -> MCPServer
url
str
URL of the remote server.
headers
dict[str, Any] | None
Headers to send to the server (typically authorization is included here).
timeout
float
Timeout for HTTP operations.
sse_read_timeout
float
How long (in seconds) the client will wait for a new event before disconnecting.
mcp_server_sandbox
MCP Server (Sandbox).
Interface to MCP server running in an Inspect sandbox.
def mcp_server_sandbox(
    *,
    command: str,
    args: list[str] = [],
    cwd: str | Path | None = None,
    env: dict[str, str] | None = None,
    sandbox: str | None = None,
) -> MCPServer
command
str
The executable to run to start the server.
args
list[str]
Command line arguments to pass to the executable.
cwd
str | Path | None
The working directory to use when spawning the process.
env
dict[str, str] | None
The environment to use when spawning the process in addition to the platform specific set of default environment variables (e.g. “HOME”, “LOGNAME”, “PATH”, “SHELL”, “TERM”, and “USER” for Posix-based systems).
sandbox
str | None
The sandbox to use when spawning the process.
mcp_tools
Tools from MCP server.
def mcp_tools(
    server: MCPServer,
    *,
    tools: Literal["all"] | list[str] = "all",
) -> ToolSource
server
MCPServer
MCP server created with mcp_server_stdio() or mcp_server_sse()
tools
Literal['all'] | list[str]
List of tool names (or globs). Defaults to “all”, which returns all tools.
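Filtering by names or globs can be pictured with `fnmatch` from the standard library. This is a sketch only: `select_tools` is a hypothetical helper, the tool names are made up, and the assumption that the library's globs follow shell-style `fnmatch` rules is not confirmed by the text above.

```python
from __future__ import annotations

from fnmatch import fnmatchcase
from typing import Literal


def select_tools(
    available: list[str], tools: Literal["all"] | list[str] = "all"
) -> list[str]:
    """Keep every tool for "all", otherwise those matching any name or glob."""
    if tools == "all":
        return list(available)
    return [
        name
        for name in available
        if any(fnmatchcase(name, pattern) for pattern in tools)
    ]


# Hypothetical tool names exposed by some server.
available = ["read_file", "write_file", "list_directory", "search_files"]

everything = select_tools(available)
file_tools = select_tools(available, ["*_file", "list_directory"])
```

Here `search_files` is excluded because it ends in `_files`, which `*_file` does not match.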
MCPServer
Model Context Protocol server interface.
MCPServer can be passed in the tools argument as a source of tools (use the mcp_tools() function to filter the list of tools).
class MCPServer(ToolSource)
Methods
tools
List of all tools provided by this server.
async def tools(self) -> list[Tool]
Dynamic
tool_with
Tool with modifications to various attributes.
This function modifies the passed tool in place and returns it. If you want to create multiple variations of a single tool using tool_with() you should create the underlying tool multiple times.
def tool_with(
    tool: Tool,
    name: str | None = None,
    description: str | None = None,
    parameters: dict[str, str] | None = None,
    parallel: bool | None = None,
    viewer: ToolCallViewer | None = None,
    model_input: ToolCallModelInput | None = None,
) -> Tool
tool
Tool
Tool instance to modify.
name
str | None
Tool name (optional).
description
str | None
Tool description (optional).
parameters
dict[str, str] | None
Parameter descriptions (optional).
parallel
bool | None
Does the tool support parallel execution (defaults to True if not specified).
viewer
ToolCallViewer | None
Optional tool call viewer implementation.
model_input
ToolCallModelInput | None
Optional function that determines how tool call results are played back as model input.
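The in-place behavior noted above is easy to trip over, so here is a small model of it. `FakeTool`, `make_tool`, and `fake_tool_with` are hypothetical stand-ins that only mimic the documented "modify and return the same object" contract; they are not the library's types.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeTool:
    """Hypothetical stand-in for a tool with a mutable name and description."""

    name: str
    description: str


def make_tool() -> FakeTool:
    """Factory that builds a fresh tool object on every call."""
    return FakeTool(name="search", description="Search the corpus.")


def fake_tool_with(
    tool: FakeTool, name: str | None = None, description: str | None = None
) -> FakeTool:
    """Mutate the tool in place and return the same object."""
    if name is not None:
        tool.name = name
    if description is not None:
        tool.description = description
    return tool


# Pitfall: both variations are the same object, so the second call wins.
base = make_tool()
v1 = fake_tool_with(base, name="search_docs")
v2 = fake_tool_with(base, name="search_code")

# Safer: create the underlying tool once per variation.
docs = fake_tool_with(make_tool(), name="search_docs")
code = fake_tool_with(make_tool(), name="search_code")
```

In the pitfall case `v1.name` ends up as `search_code`, since `v1`, `v2`, and `base` all refer to one object.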
ToolDef
Tool definition.
class ToolDef
Attributes
tool
Callable[..., Any]
Callable to execute tool.
name
str
Tool name.
description
str
Tool description.
parameters
ToolParams
Tool parameter descriptions.
parallel
bool
Supports parallel execution.
viewer
ToolCallViewer | None
Custom viewer for tool call.
model_input
ToolCallModelInput | None
Custom model input presenter for tool calls.
Methods
__init__
Create a tool definition.
def __init__(
    self,
    tool: Callable[..., Any],
    name: str | None = None,
    description: str | None = None,
    parameters: dict[str, str] | ToolParams | None = None,
    parallel: bool | None = None,
    viewer: ToolCallViewer | None = None,
    model_input: ToolCallModelInput | None = None,
) -> None
tool
Callable[..., Any]
Callable to execute tool.
name
str | None
Name of tool. Discovered automatically if not specified.
description
str | None
Description of tool. Discovered automatically by parsing doc comments if not specified.
parameters
dict[str, str] | ToolParams | None
Tool parameter descriptions and types. Discovered automatically by parsing doc comments if not specified.
parallel
bool | None
Does the tool support parallel execution (defaults to True if not specified).
viewer
ToolCallViewer | None
Optional tool call viewer implementation.
model_input
ToolCallModelInput | None
Optional function that determines how tool call results are played back as model input.
as_tool
Convert a ToolDef to a Tool.
def as_tool(self) -> Tool
Types
Tool
Additional tool that an agent can use to solve a task.
class Tool(Protocol):
    async def __call__(
        self,
        *args: Any,
        **kwargs: Any,
    ) -> ToolResult
*args
Any
Arguments for the tool.
**kwargs
Any
Keyword arguments for the tool.
Examples
@tool
def add() -> Tool:
    async def execute(x: int, y: int) -> int:
        return x + y
    return execute
ToolResult
Valid types for results from tool calls.
ToolResult = (
    str
    | int
    | float
    | bool
    | ContentText
    | ContentReasoning
    | ContentImage
    | ContentAudio
    | ContentVideo
    | list[ContentText | ContentReasoning | ContentImage | ContentAudio | ContentVideo]
)
ToolError
Exception thrown from tool call.
If you throw a ToolError from within a tool call, the error will be reported to the model for further processing (rather than ending the sample). If you want to raise a fatal error from a tool call, use an appropriate standard exception type (e.g. RuntimeError, ValueError, etc.).
class ToolError(Exception)
Methods
__init__
Create a ToolError.
def __init__(self, message: str) -> None
message
str
Error message to report to the model.
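The split between recoverable and fatal errors can be sketched with a stand-in exception class. This is a toy model, not the library's dispatch logic: `FakeToolError`, `read_config`, and `call_tool` are hypothetical, and the "error: ..." string is just one way a runner might surface the message to the model.

```python
from __future__ import annotations

import asyncio
from typing import Awaitable, Callable


class FakeToolError(Exception):
    """Hypothetical stand-in for ToolError: a recoverable, model-visible error."""

    def __init__(self, message: str) -> None:
        super().__init__(message)
        self.message = message


async def read_config(path: str) -> str:
    """Hypothetical tool body with one fatal and one recoverable failure."""
    if not path:
        # Fatal: a standard exception type is not caught by the runner below.
        raise ValueError("path must not be empty")
    if not path.endswith(".toml"):
        # Recoverable: the model can read this and try another path.
        raise FakeToolError(f"{path} is not a .toml file")
    return "ok"


async def call_tool(tool: Callable[[str], Awaitable[str]], arg: str) -> str:
    """Return tool output, or the error text for the model to react to."""
    try:
        return await tool(arg)
    except FakeToolError as err:
        return f"error: {err.message}"


reported = asyncio.run(call_tool(read_config, "notes.txt"))

try:
    asyncio.run(call_tool(read_config, ""))
    fatal = None
except ValueError as err:
    fatal = str(err)
```

Only the stand-in tool error is converted into a result; the `ValueError` propagates out of `call_tool`, which mirrors the "ends the sample" case.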
ToolCallError
Error raised by a tool call.
@dataclass
class ToolCallError
Attributes
type
Literal['parsing', 'timeout', 'unicode_decode', 'permission', 'file_not_found', 'is_a_directory', 'output_limit', 'approval', 'unknown']
Error type.
message
str
Error message.
ToolChoice
Specify which tool to call.
“auto” means the model decides; “any” means use at least one tool; “none” means never call a tool; ToolFunction instructs the model to call a specific function.
ToolChoice = Union[Literal["auto", "any", "none"], ToolFunction]
ToolFunction
Indicate that a specific tool function should be called.
@dataclass
class ToolFunction
Attributes
name
str
The name of the tool function to call.
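The four cases of ToolChoice can be made concrete with local mirrors of the two definitions. The `ToolFunction` dataclass and `ToolChoice` alias below copy the documented shapes; `allowed_tools` is a hypothetical helper and does not model the "at least one call" requirement of “any”.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Literal, Union


@dataclass
class ToolFunction:
    """Local mirror of the documented dataclass (single 'name' attribute)."""

    name: str


ToolChoice = Union[Literal["auto", "any", "none"], ToolFunction]


def allowed_tools(choice: ToolChoice, available: list[str]) -> list[str]:
    """Which tools may the model call under this choice?"""
    if isinstance(choice, ToolFunction):
        # A specific function: only that tool, if it exists.
        return [choice.name] if choice.name in available else []
    if choice == "none":
        return []
    # "auto" and "any" both leave every tool available.
    return list(available)


available = ["bash", "python"]
forced = allowed_tools(ToolFunction(name="python"), available)
disabled = allowed_tools("none", available)
free = allowed_tools("auto", available)
```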
ToolInfo
Specification of a tool (JSON Schema compatible)
If you are implementing a ModelAPI, most LLM libraries can be passed this object (dumped to a dict) directly as a function specification. For example, in the OpenAI provider:
ChatCompletionToolParam(
    type="function",
    function=tool.model_dump(exclude_none=True),
)
In some cases the field names don’t match up exactly. In that case call model_dump() on the parameters field. For example, in the Anthropic provider:
ToolParam(
    name=tool.name,
    description=tool.description,
    input_schema=tool.parameters.model_dump(exclude_none=True),
)
class ToolInfo(BaseModel)
Attributes
name
str
Name of tool.
description
str
Short description of tool.
parameters
ToolParams
JSON Schema of tool parameters object.
ToolParams
Description of tool parameters object in JSON Schema format.
class ToolParams(BaseModel)
Attributes
type
Literal['object']
Params type (always ‘object’)
properties
dict[str, ToolParam]
Tool function parameters.
required
list[str]
List of required fields.
additionalProperties
bool
Are additional object properties allowed? (always False)
ToolParam
Description of tool parameter in JSON Schema format.
ToolParam: TypeAlias = JSONSchema
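As a concrete picture of the ToolParams shape, here is what the parameters object for the `add(x, y)` example tool might look like as a plain dict. The field names follow the attributes listed above; the parameter descriptions and the `validate_call` helper are hypothetical, and the dict is hand-written and not produced by the library.

```python
import json

# Hand-written schema for a tool taking two integers (descriptions are made up).
add_params = {
    "type": "object",
    "properties": {
        "x": {"type": "integer", "description": "First addend."},
        "y": {"type": "integer", "description": "Second addend."},
    },
    "required": ["x", "y"],
    "additionalProperties": False,
}


def validate_call(arguments: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the call is acceptable."""
    problems = [f"missing: {key}" for key in schema["required"] if key not in arguments]
    if not schema["additionalProperties"]:
        problems += [
            f"unexpected: {key}" for key in arguments if key not in schema["properties"]
        ]
    return problems


ok = validate_call({"x": 1, "y": 2}, add_params)
bad = validate_call({"x": 1, "z": 3}, add_params)
schema_json = json.dumps(add_params, indent=2)
```

The check covers only required and unexpected keys; a full JSON Schema validator would also check value types.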
ToolSource
Protocol for dynamically providing a set of tools.
@runtime_checkable
class ToolSource(Protocol)
Methods
tools
Retrieve tools from tool source.
async def tools(self) -> list[Tool]
Decorator
tool
Decorator for registering tools.
def tool(
    func: Callable[P, Tool] | None = None,
    *,
    name: str | None = None,
    viewer: ToolCallViewer | None = None,
    model_input: ToolCallModelInput | None = None,
    parallel: bool = True,
    prompt: str | None = None,
) -> Callable[P, Tool] | Callable[[Callable[P, Tool]], Callable[P, Tool]]
func
Callable[P, Tool] | None
Tool function
name
str | None
Optional name for tool. If the decorator has no name argument then the name of the tool creation function will be used as the name of the tool.
viewer
ToolCallViewer | None
Provide a custom view of tool call and context.
model_input
ToolCallModelInput | None
Provide a custom function for playing back tool results as model input.
parallel
bool
Does this tool support parallel execution? (defaults to True).
prompt
str | None
Deprecated (provide all descriptive information about the tool within the tool function’s doc comment)
Examples
@tool
def add() -> Tool:
    async def execute(x: int, y: int) -> int:
        return x + y
    return execute
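The signature above accepts either the decorated function directly or keyword options, which is why the return type is a union. The general pattern can be sketched as follows; `my_tool` and `REGISTRY` are hypothetical and show only the dual-use decorator shape and the name fallback, not how the library registers tools.

```python
from __future__ import annotations

import asyncio
from typing import Callable

# Hypothetical registry mapping tool names to their factory functions.
REGISTRY: dict[str, Callable[..., object]] = {}


def my_tool(func: Callable[..., object] | None = None, *, name: str | None = None):
    """Register a tool factory; works as @my_tool and as @my_tool(name=...)."""

    def register(factory: Callable[..., object]) -> Callable[..., object]:
        # Fall back to the factory's own name when no name is given.
        REGISTRY[name or factory.__name__] = factory
        return factory

    # Bare use (@my_tool): func is the decorated factory, so register it now.
    if func is not None:
        return register(func)
    # Parameterized use (@my_tool(...)): return the decorator itself.
    return register


@my_tool
def add():
    async def execute(x: int, y: int) -> int:
        return x + y

    return execute


@my_tool(name="subtract")
def sub():
    async def execute(x: int, y: int) -> int:
        return x - y

    return execute


# Look up a factory by name, build the tool, and await one call.
result = asyncio.run(REGISTRY["subtract"]()(5, 3))
```

The bare form registers under the function name `add`, while the parameterized form registers `sub` under the explicit name `subtract`.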