inspect_ai.event

Transcript events.

Core Events

ModelEvent

Call to a language model.

class ModelEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['model']

Event type.

model str

Model name.

role str | None

Model role.

input list[ChatMessage]

Model input (list of messages).

input_refs list[tuple[int, int]] | None

Message pool references for input. Each element is a (start, end_exclusive) range.

tools list[ToolInfo]

Tools available to the model.

tool_choice ToolChoice

Directive to the model which tools to prefer.

config GenerateConfig

Generate config used for call to model.

output ModelOutput

Output from model.

retries int | None

Retries for the model API request.

error str | None

Error which occurred during model call.

traceback str | None

Error traceback (plain text).

traceback_ansi str | None

Error traceback with ANSI color codes for display.

cache Literal['read', 'write'] | None

Was this a cache read or write.

call ModelCall | None

Raw call made to model API.

completed UtcDatetime | None

Time that model call completed (see timestamp for started)

working_time float | None

Working time for model call that succeeded (i.e. was not retried).
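
The (start, end_exclusive) convention described for input_refs maps directly onto Python slicing. The following is a minimal sketch rather than the library's own resolution code: the message pool is a hypothetical list of strings, and resolve_refs is an illustrative helper name that does not come from inspect_ai.

```python
def resolve_refs(pool: list[str], refs: list[tuple[int, int]]) -> list[str]:
    """Expand (start, end_exclusive) ranges into the pool items they cover."""
    resolved: list[str] = []
    for start, end_exclusive in refs:
        # A half-open range is exactly what a Python slice expresses.
        resolved.extend(pool[start:end_exclusive])
    return resolved


# Hypothetical pool of five messages, referenced by two ranges.
pool = ["system", "user-1", "assistant-1", "user-2", "assistant-2"]
messages = resolve_refs(pool, [(0, 2), (3, 5)])
print(messages)
```

Because the end index is exclusive, adjacent ranges such as (0, 2) and (2, 4) cover consecutive messages without overlap.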

ToolEvent

Call to a tool.

class ToolEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['tool']

Event type.

type Literal['function']

Type of tool call (currently only ‘function’)

id str

Unique identifier for tool call.

function str

Function called.

arguments dict[str, JsonValue]

Arguments to function.

view ToolCallContent | None

Custom view of tool call input.

result ToolResult

Function return value.

truncated tuple[int, int] | None

Bytes truncated (from,to) if truncation occurred

error ToolCallError | None

Error that occurred during tool call.

completed UtcDatetime | None

Time that tool call completed (see timestamp for started)

working_time float | None

Working time for tool call (i.e. time not spent waiting on semaphores).

agent str | None

Name of agent if the tool call was an agent handoff.

agent_span_id str | None

Span ID of the agent span, if this tool call spawned an agent.

failed bool | None

Did the tool call fail with a hard error?

message_id str | None

Id of ChatMessageTool associated with this event.

cancelled bool

Was the task cancelled?
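
Since timestamp marks the start of a tool call, completed marks its end, and working_time excludes time spent waiting on semaphores, the difference between elapsed and working time gives a rough estimate of waiting time. This is a sketch under that assumption; waiting_seconds is a hypothetical helper, and the datetimes are made-up values rather than real transcript data.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def waiting_seconds(
    timestamp: datetime,
    completed: Optional[datetime],
    working_time: Optional[float],
) -> Optional[float]:
    """Estimate seconds spent waiting: elapsed clock time minus working time."""
    if completed is None or working_time is None:
        return None  # the call has not completed or has no recorded working time
    elapsed = (completed - timestamp).total_seconds()
    # Clamp at zero in case of small clock discrepancies.
    return max(0.0, elapsed - working_time)


# Illustrative values: a call that took 10 seconds, 7.5 of them working.
started = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
finished = started + timedelta(seconds=10)
waited = waiting_seconds(started, finished, 7.5)
print(waited)
```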

BranchEvent

Marks where a branched trajectory’s unique content begins.

Emitted at the point where a branch transitions from replaying its parent’s prefix to live execution. Events before this in the trajectory’s span are replay-phase re-execution; events after are the branch’s genuine new content.

class BranchEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['branch']

Event type.

from_anchor str

Anchor at the branch point (matches an AnchorEvent.anchor_id in the parent).
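
The replay/live distinction described above can be illustrated by splitting a branch span's events at the branch marker. This is a minimal sketch: StubEvent and split_at_branch are hypothetical stand-ins, not inspect_ai names, and treating a span without a marker as entirely live is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass
class StubEvent:
    """Hypothetical stand-in for a transcript event (not an inspect_ai class)."""

    event: str  # event type, e.g. "model", "tool", or "branch"
    label: str


def split_at_branch(
    events: list[StubEvent],
) -> tuple[list[StubEvent], list[StubEvent]]:
    """Split a branch span's events into (replay, live) around the marker."""
    for index, item in enumerate(events):
        if item.event == "branch":
            # Events before the marker are replay; events after are new content.
            return events[:index], events[index + 1 :]
    # No marker found: treat everything as live content (an assumption).
    return [], list(events)


span_events = [
    StubEvent("model", "replayed call"),
    StubEvent("branch", "marker"),
    StubEvent("tool", "new call"),
]
replay, live = split_at_branch(span_events)
print([e.label for e in replay], [e.label for e in live])
```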

CompactionEvent

Compaction of conversation history.

class CompactionEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['compaction']

Event type.

type Literal['summary', 'edit', 'trim']

Compaction type.

tokens_before int | None

Tokens before compaction.

tokens_after int | None

Tokens after compaction.

source str | None

Compaction source (e.g. ‘inspect’, ‘claude_code’, etc.)
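
One way to use tokens_before and tokens_after is to compute how much of the conversation a compaction removed. The helper below is a hypothetical sketch, not part of inspect_ai; it returns None when either count is missing, since both attributes are optional.

```python
from typing import Optional


def compaction_savings(
    tokens_before: Optional[int], tokens_after: Optional[int]
) -> Optional[float]:
    """Return the fraction of tokens removed, or None if it cannot be computed."""
    if tokens_before is None or tokens_after is None or tokens_before <= 0:
        return None
    return (tokens_before - tokens_after) / tokens_before


# Illustrative numbers only.
savings = compaction_savings(12000, 3000)
print(savings)
```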

ApprovalEvent

Tool approval.

class ApprovalEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['approval']

Event type

message str

Message generated by model along with tool call.

call ToolCall

Tool call being approved.

view ToolCallView | None

View presented for approval.

approver str

Approver name.

decision Literal['approve', 'modify', 'reject', 'escalate', 'terminate']

Decision of approver.

modified ToolCall | None

Modified tool call for decision ‘modify’.

explanation str | None

Explanation for decision.

SandboxEvent

Sandbox execution or I/O

class SandboxEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['sandbox']

Event type

action Literal['exec', 'read_file', 'write_file']

Sandbox action

cmd str | None

Command (for exec)

options dict[str, JsonValue] | None

Options (for exec)

file str | None

File (for read_file and write_file)

input str | None

Input (for cmd and write_file). Truncated to 100 lines.

result int | None

Result (for exec)

output str | None

Output (for exec and read_file). Truncated to 100 lines.

completed UtcDatetime | None

Time that sandbox action completed (see timestamp for started)

InfoEvent

Event with custom info/data.

class InfoEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['info']

Event type.

source str | None

Optional source for info event.

data JsonValue

Data provided with event.

ScoreEvent

Event with score.

Can be the final score for a Sample, or can be an intermediate score resulting from a call to score.

class ScoreEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['score']

Event type.

score Score

Score value.

target str | list[str] | None

Sample target.

intermediate bool

Was this an intermediate scoring?

scorer str | None

Name of the scorer that produced this score (unique within the task).

scorer_args dict[str, Any] | None

Arguments the scorer was instantiated with (None for scores set directly by a solver via state.scores).

model_usage dict[str, ModelUsage] | None

Cumulative model usage at the time of this score.

role_usage dict[str, ModelUsage] | None

Cumulative model usage by role at the time of this score.

LoggerEvent

Log message recorded with Python logger.

class LoggerEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['logger']

Event type.

message LoggingMessage

Logging message

ErrorEvent

Event with sample error.

class ErrorEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['error']

Event type.

error EvalError

Sample error

SpanBeginEvent

Mark the beginning of a transcript span.

class SpanBeginEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['span_begin']

Event type.

id str

Unique identifier for span.

parent_id str | None

Identifier for parent span.

type str | None

Optional ‘type’ field for span.

name str

Span name.

SpanEndEvent

Mark the end of a transcript span.

class SpanEndEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['span_end']

Event type.

id str

Unique identifier for span.

Event Tree

event_tree

Build a tree representation of a sequence of events.

Organize events hierarchically into event spans.

def event_tree(events: Sequence[Event]) -> EventTree
events Sequence[Event]

Sequence of Event.
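
The idea of nesting a flat event list by span can be sketched with stand-in types. Everything below is hypothetical (StubEvent, StubSpan, and build_tree are not inspect_ai names), and the sketch simplifies by discarding the begin and end markers, whereas EventTreeSpan keeps them as its begin and end attributes.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class StubEvent:
    """Hypothetical stand-in for an event (not an inspect_ai class)."""

    event: str  # "span_begin", "span_end", or any other event type
    span_id: Optional[str] = None  # span the event occurred within
    id: Optional[str] = None  # span id (span_begin / span_end only)
    parent_id: Optional[str] = None  # parent span (span_begin only)
    name: str = ""


@dataclass
class StubSpan:
    """Hypothetical stand-in for a span node holding child nodes."""

    id: str
    name: str
    children: list = field(default_factory=list)


def build_tree(events: list[StubEvent]) -> list[Union[StubSpan, StubEvent]]:
    """Nest events under the span named by span_id; the rest stay at the root."""
    roots: list[Union[StubSpan, StubEvent]] = []
    spans: dict[str, StubSpan] = {}
    for item in events:
        if item.event == "span_begin":
            span = StubSpan(id=item.id or "", name=item.name)
            spans[span.id] = span
            parent = spans.get(item.parent_id or "")
            (parent.children if parent else roots).append(span)
        elif item.event == "span_end":
            continue  # this sketch drops end markers
        else:
            parent = spans.get(item.span_id or "")
            (parent.children if parent else roots).append(item)
    return roots


events = [
    StubEvent("span_begin", id="s1", name="solvers"),
    StubEvent("model", span_id="s1", name="call-1"),
    StubEvent("span_begin", id="s2", parent_id="s1", name="tool"),
    StubEvent("tool", span_id="s2", name="bash"),
    StubEvent("span_end", id="s2"),
    StubEvent("span_end", id="s1"),
    StubEvent("info", name="after"),
]
tree = build_tree(events)
print([node.name for node in tree])
```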

event_tree_walk

Walk an event tree yielding nodes matching a filter.

def event_tree_walk(
    tree: EventTree,
    filter: type[T]
    | tuple[type[T], ...]
    | Callable[[EventTreeNode], bool]
    | None = None,
) -> Iterable[T] | Iterable[EventTreeNode]
tree EventTree

Event tree to walk.

filter type[T] | tuple[type[T], ...] | Callable[[EventTreeNode], bool] | None

A type, tuple of types (passed to isinstance), a predicate function, or None to yield all nodes.
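
The four accepted filter forms (a type, a tuple of types, a predicate, or None) can be illustrated with a small stand-in walker. This is a sketch, not the library's implementation: StubSpan, StubModel, StubTool, matches, and walk are hypothetical names, and the depth-first, parent-before-children order is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterator, Union


@dataclass
class StubSpan:
    """Hypothetical stand-in for a span node (not an inspect_ai class)."""

    name: str
    children: list = field(default_factory=list)


@dataclass
class StubModel:
    """Hypothetical stand-in for a model event."""

    model: str


@dataclass
class StubTool:
    """Hypothetical stand-in for a tool event."""

    function: str


NodeFilter = Union[type, tuple, Callable[[Any], bool], None]


def matches(node: Any, node_filter: NodeFilter) -> bool:
    """Apply the four filter forms: None, type, tuple of types, predicate."""
    if node_filter is None:
        return True
    if isinstance(node_filter, (type, tuple)):
        return isinstance(node, node_filter)
    return bool(node_filter(node))


def walk(tree: list, node_filter: NodeFilter = None) -> Iterator[Any]:
    """Yield matching nodes depth-first, descending into span children."""
    for node in tree:
        if matches(node, node_filter):
            yield node
        if isinstance(node, StubSpan):
            yield from walk(node.children, node_filter)


tree = [
    StubSpan("agent", [StubModel("m1"), StubSpan("tool", [StubTool("bash")])]),
    StubModel("m2"),
]
models = [node.model for node in walk(tree, StubModel)]
print(models)
```

Passing a type or tuple delegates to isinstance, so the type form is shorthand for the most common predicate.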

event_sequence

Flatten a span forest back into a properly ordered sequence.

def event_sequence(tree: EventTree | EventTreeSpan) -> Iterable[Event]
tree EventTree | EventTreeSpan

Event tree or EventTreeSpan.

EventTree

Tree of events (has individual events and event spans).

EventTree: TypeAlias = list[EventTreeNode]

EventTreeSpan

Event tree node representing a span of events.

@dataclass
class EventTreeSpan

Attributes

id str

Span id.

parent_id str | None

Parent span id.

type str | None

Optional ‘type’ field for span.

name str

Span name.

begin SpanBeginEvent

Span begin event.

end SpanEndEvent | None

Span end event (if any).

children list[EventTreeNode]

Children in the span.

EventTreeNode

Node in an event tree.

EventTreeNode: TypeAlias = Union["EventTreeSpan", Event]

Timeline

timeline_build

Build a Timeline from a flat event list.

Transforms a flat event stream into a hierarchical Timeline tree with agent-centric interpretation. The pipeline has three phases:

Phase 1 — Structure extraction:

Uses event_tree() to parse span_begin/span_end events into a tree, then looks for top-level phase spans (“init”, “solvers”, “scorers”):

  • If present, partitions events into init (setup), agent (solvers), and scoring sections.
  • If absent, treats the entire event stream as the agent.

Phase 2 — Agent classification:

Within the agent section, spans are classified as agents or unrolled:

============================== =======================================
Span type                      Result
============================== =======================================
type="agent"                   TimelineSpan(span_type="agent")
type="solver"                  TimelineSpan(span_type="agent")
type="tool" + ModelEvents      TimelineSpan(span_type="agent")
ToolEvent with agent field     TimelineSpan(span_type="agent")
type="tool" (no models)        Unrolled into parent
Any other span type            Unrolled into parent
============================== =======================================

“Unrolled” means the span wrapper is removed and its child events dissolve into the parent’s content list.

Phase 3 — Post-processing passes:

  • Utility agent classification (single-turn agents with different system prompts)
def timeline_build(
    events: list[Event], *, name: str | None = None, description: str | None = None
) -> Timeline
events list[Event]

Flat list of Events from a transcript.

name str | None

Optional name for timeline (defaults to “Default”)

description str | None

Optional description for timeline (defaults to an empty string)
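
The span rows of the Phase 2 classification table can be restated as a small decision function. This is a sketch of the documented rules only, not the library's code: classify_span and its has_model_events parameter are hypothetical, and the ToolEvent row (a tool event with an agent field) is left out because it concerns events rather than spans.

```python
from typing import Optional


def classify_span(span_type: Optional[str], has_model_events: bool = False) -> str:
    """Return "agent" or "unrolled" for a span, following the table's span rows."""
    if span_type in ("agent", "solver"):
        return "agent"
    if span_type == "tool" and has_model_events:
        # A tool span that contains model calls is treated as an agent.
        return "agent"
    # Tool spans without model calls and all other span types are unrolled.
    return "unrolled"


for span_type, has_models in [("agent", False), ("tool", True), ("tool", False)]:
    print(span_type, has_models, classify_span(span_type, has_models))
```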

timeline_dump

Serialize a Timeline to a JSON-compatible dict.

Converts a Timeline into a plain dictionary suitable for JSON serialization. Event objects within the timeline are replaced by their UUIDs, keeping the serialized form compact and self-referencing.

def timeline_dump(timeline: Timeline) -> dict[str, Any]
timeline Timeline

The Timeline to serialize.

timeline_filter

Return a new timeline with only spans matching the predicate.

Recursively walks the span tree, keeping TimelineSpan items where predicate(span) returns True. Non-matching spans and their entire subtrees are pruned. TimelineEvent items are always kept (they belong to the parent span).

Use this to pre-filter a timeline before passing it to timeline_messages().

def timeline_filter(
    timeline: Timeline,
    predicate: Callable[[TimelineSpan], bool],
) -> Timeline
timeline Timeline

The timeline to filter.

predicate Callable[[TimelineSpan], bool]

Function that receives a TimelineSpan and returns True to keep it (and its subtree), False to prune it.
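
The pruning rule (matching spans are kept, non-matching spans are dropped with their subtrees, events are always kept) can be sketched with a stand-in span type. StubSpan and filter_spans are hypothetical names, plain strings stand in for events, and applying the predicate only to child spans rather than to the root is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class StubSpan:
    """Hypothetical stand-in for a timeline span (not an inspect_ai class)."""

    name: str
    content: list = field(default_factory=list)  # child spans or event labels


def filter_spans(span: StubSpan, predicate: Callable[[StubSpan], bool]) -> StubSpan:
    """Return a copy of span with non-matching child spans pruned recursively."""
    kept: list = []
    for item in span.content:
        if isinstance(item, StubSpan):
            if predicate(item):
                kept.append(filter_spans(item, predicate))
            # Non-matching spans are dropped together with their subtree.
        else:
            kept.append(item)  # events always stay with their parent span
    return StubSpan(span.name, kept)


root = StubSpan(
    "root",
    ["e1", StubSpan("agent", ["e2", StubSpan("helper", ["e3"])]), StubSpan("helper", ["e4"])],
)
filtered = filter_spans(root, lambda s: s.name != "helper")
print(filtered)
```

Building a new tree instead of mutating the input mirrors the documented behaviour of returning a new timeline.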

timeline_load

Deserialize a Timeline from a dict produced by timeline_dump.

Reconstructs a full Timeline by resolving the UUID strings stored in data back to their corresponding Event objects from events.

def timeline_load(data: dict[str, Any], events: list[Event]) -> Timeline
data dict[str, Any]

A dict previously produced by timeline_dump.

events list[Event]

The flat list of Event objects whose UUIDs appear in data. Events without a UUID are ignored.
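
The round trip between timeline_dump and timeline_load rests on replacing events with their UUIDs and later resolving those UUIDs against the full event list. The sketch below shows only that idea: StubEvent, dump_events, load_events, and the "events" key are hypothetical, and the real serialized layout is not shown in this reference.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StubEvent:
    """Hypothetical stand-in for an event (not an inspect_ai class)."""

    uuid: Optional[str]
    label: str


def dump_events(events: list[StubEvent]) -> dict[str, Any]:
    """Replace each event with its UUID to get a compact, JSON-friendly dict."""
    return {"events": [item.uuid for item in events]}


def load_events(data: dict[str, Any], events: list[StubEvent]) -> list[StubEvent]:
    """Resolve stored UUIDs back to event objects; events without a UUID are ignored."""
    by_uuid = {item.uuid: item for item in events if item.uuid is not None}
    return [by_uuid[uuid] for uuid in data["events"]]


transcript = [
    StubEvent("a", "model call"),
    StubEvent(None, "no uuid"),
    StubEvent("b", "tool call"),
]
data = dump_events([transcript[0], transcript[2]])
restored = load_events(data, transcript)
print(data, [item.label for item in restored])
```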

timeline_branch

Context manager for creating a timeline branch.

Emits an AnchorEvent in the current (parent) span so the viewer can resolve from_anchor to a position, then opens a type="branch" span and emits a BranchEvent inside it.

@contextlib.asynccontextmanager
async def timeline_branch(
    *, name: str, from_anchor: str, id: str | None = None
) -> AsyncIterator[None]
name str

Name of branch span.

from_anchor str

Anchor id at the branch point.

id str | None

Optional span ID. Generated if not provided.
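
The ordering described above (anchor in the parent span, then the branch span opens, then the branch marker, then the body) can be sketched with a stand-in async context manager. This is not the library's implementation: branch_sketch and the log list are hypothetical, and strings stand in for the real events.

```python
import asyncio
import contextlib
from typing import AsyncIterator

# Hypothetical in-memory transcript; strings stand in for real events.
log: list[str] = []


@contextlib.asynccontextmanager
async def branch_sketch(*, name: str, from_anchor: str) -> AsyncIterator[None]:
    """Record the sequence of markers a branch context would produce."""
    log.append(f"anchor:{from_anchor}")  # emitted in the parent span
    log.append(f"span_begin:{name}")  # opens the branch span
    log.append(f"branch:{from_anchor}")  # first event inside the branch span
    try:
        yield
    finally:
        log.append(f"span_end:{name}")  # span closes even if the body raises


async def main() -> None:
    async with branch_sketch(name="retry", from_anchor="a1"):
        log.append("model:new call")


asyncio.run(main())
print(log)
```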

Timeline

A named timeline view over a transcript.

Multiple timelines allow different interpretations of the same event stream — e.g. a default agent-centric view alongside an alternative grouping or filtered view.

class Timeline(BaseModel)

Methods

render

Render an ASCII swimlane diagram of the timeline.

def render(self, width: int | None = None) -> str
width int | None

Total width of the output in characters. Defaults to 120.

TimelineEvent

Wraps a single Event.

class TimelineEvent(BaseModel)

Methods

start_time

Event timestamp (required field on all events).

def start_time(self) -> datetime
end_time

Event completion time if available, else timestamp.

def end_time(self) -> datetime
total_tokens

Tokens from this event (ModelEvent only).

Includes input_tokens_cache_read and input_tokens_cache_write in the total, as these represent actual token consumption for any LLM system using prompt caching. The sum of all token fields provides an accurate measure of total context window usage across all sources.

def total_tokens(self) -> int
idle_time

Seconds of idle time (always 0 for a single event).

def idle_time(self) -> float
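
The note on total_tokens says cache reads and writes count toward the total. A minimal sketch of that accounting follows, assuming the total is the sum of input, output, cache-read, and cache-write tokens. StubUsage is a hypothetical stand-in whose field names follow the ones mentioned above; the exact set of fields summed by the library is not confirmed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubUsage:
    """Hypothetical stand-in for per-call token usage (not an inspect_ai class)."""

    input_tokens: int = 0
    output_tokens: int = 0
    input_tokens_cache_read: Optional[int] = None
    input_tokens_cache_write: Optional[int] = None


def total_tokens(usage: StubUsage) -> int:
    """Sum all token fields, treating missing cache counts as zero."""
    return (
        usage.input_tokens
        + usage.output_tokens
        + (usage.input_tokens_cache_read or 0)
        + (usage.input_tokens_cache_write or 0)
    )


# Illustrative numbers only.
total = total_tokens(StubUsage(100, 20, 50, 10))
print(total)
```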

TimelineSpan

A span of execution — agent, scorer, tool, or root.

class TimelineSpan(BaseModel)

Attributes

tool_invoked bool

True if this agent span was invoked as a tool (via task/as_tool/handoff).

Tool-invoked subagents are explicit user-intended sub-trajectories and are never classified as utility regardless of turn count or prompt differences. The _classify_utility_agents heuristic targets internal helper model calls, not explicit subagent invocations.

Methods

start_time

Earliest start time among content (and optionally branches).

def start_time(self, include_branches: bool = True) -> datetime
include_branches bool

Include branches in time calculation.

end_time

Latest end time among content (and optionally branches).

def end_time(self, include_branches: bool = True) -> datetime
include_branches bool

Include branches in time calculation.

total_tokens

Sum of tokens from content (and optionally branches).

def total_tokens(self, include_branches: bool = True) -> int
include_branches bool

Include branches in token calculation.

idle_time

Seconds of idle time within this span (and optionally branches).

def idle_time(self, include_branches: bool = True) -> float
include_branches bool

Include branches in time calculation.

Outline

Hierarchical outline of events for an agent.

class Outline(BaseModel)

OutlineNode

A node in an agent’s outline, referencing an event by UUID.

class OutlineNode(BaseModel)

Eval Events

SampleInitEvent

Beginning of processing a Sample.

class SampleInitEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['sample_init']

Event type.

sample Sample

Sample.

state JsonValue

Initial state.

SampleLimitEvent

The sample was unable to finish processing due to a limit

class SampleLimitEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['sample_limit']

Event type.

type Literal['message', 'time', 'working', 'token', 'cost', 'operator', 'custom']

Type of limit that halted processing

message str

A message associated with this limit

limit float | None

The limit value (if any)

StateEvent

Change to the current TaskState

class StateEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['state']

Event type.

changes list[JsonChange]

List of changes to the TaskState

StoreEvent

Change to data within the current Store.

class StoreEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['store']

Event type.

changes list[JsonChange]

List of changes to the Store.

InputEvent

Input screen interaction.

class InputEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['input']

Event type.

input str

Input interaction (plain text).

input_ansi str

Input interaction (ANSI).

ScoreEditEvent

Event recorded when a score is edited.

class ScoreEditEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['score_edit']

Event type.

score_name str

Name of the score being edited.

edit ScoreEdit

The edit being applied to the score.

InterruptEvent

Records that an agent’s turn or sample was cut short.

Emitted in three cases:

  • source="user_cancel" — an ACP client (e.g. an editor or TUI) called session/cancel while a turn was in flight.
  • source="limit" — a sample-level limit (tokens, time, cost, messages) tripped during execution.
  • source="system" — the eval is shutting down for an external reason and is cancelling active samples.

The interrupted field records what was running at the moment the cancel reached the cancel scope. interrupted_tool_call_id and interrupted_model_event_id give cross-references when applicable so downstream consumers can correlate this event with the in-flight ToolEvent or ModelEvent.

class InterruptEvent(BaseEvent)

Attributes

uuid str | None

Unique identifier for event.

span_id str | None

Span the event occurred within.

timestamp UtcDatetime

Clock time at which event occurred.

working_start float

Working time (within sample) at which the event occurred.

metadata dict[str, Any] | None

Additional event metadata.

pending bool | None

Is this event pending?

event Literal['interrupt']

Event type.

source Literal['user_cancel', 'limit', 'system']

What caused the interrupt.

interrupted Literal['generate', 'tool_call', 'between_turns']

What was running at the moment of the interrupt.

interrupted_tool_call_id str | None

ToolEvent.id (the underlying ToolCall.id) of the in-flight tool, if any.

interrupted_model_event_id str | None

ModelEvent.uuid of the in-flight model call, if any.
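
The cross-reference fields make it possible to look up what was running when an interrupt arrived. The sketch below shows one way a downstream consumer might do that, using stand-in types: StubInterrupt, StubTool, StubModel, and find_in_flight are hypothetical names, with only the attribute names taken from this reference.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class StubTool:
    """Hypothetical stand-in for a tool event, matched by its id."""

    id: str


@dataclass
class StubModel:
    """Hypothetical stand-in for a model event, matched by its uuid."""

    uuid: str


@dataclass
class StubInterrupt:
    """Hypothetical stand-in carrying the documented cross-reference fields."""

    interrupted: str
    interrupted_tool_call_id: Optional[str] = None
    interrupted_model_event_id: Optional[str] = None


def find_in_flight(
    interrupt: StubInterrupt, events: list[Union[StubTool, StubModel]]
) -> Optional[Union[StubTool, StubModel]]:
    """Return the event that was in flight, or None (e.g. between turns)."""
    for item in events:
        if isinstance(item, StubTool) and item.id == interrupt.interrupted_tool_call_id:
            return item
        if isinstance(item, StubModel) and item.uuid == interrupt.interrupted_model_event_id:
            return item
    return None


events = [StubModel("m-1"), StubTool("call-7"), StubModel("m-2")]
hit = find_in_flight(StubInterrupt("tool_call", interrupted_tool_call_id="call-7"), events)
print(hit)
```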

Types

LoggingLevel

Logging level.

LoggingLevel = Literal[
    "debug", "trace", "http", "sandbox", "info", "warning", "error", "critical"
]

LoggingMessage

Message written to Python log.

class LoggingMessage(BaseModel)

Attributes

name str | None

Logger name (e.g. ‘httpx’)

level LoggingLevel

Logging level.

message str

Log message.

created float

Message created time.

filename str

Logged from filename.

module str

Logged from module.

lineno int

Logged from line number.