Scanner API
Scanner
Scanner
Scan transcript content.
class Scanner(Protocol[T]):
    def __call__(self, input: T, /) -> Awaitable[Result | list[Result]]

input: T
    Input to scan.
ScannerInput
Union of all valid scanner input types.
ScannerInput = Union[
    Transcript,
    ChatMessage,
    Sequence[ChatMessage],
    Event,
    Sequence[Event],
    Timeline,
    Sequence[Timeline],
]

Result
Scan result.
class Result(BaseModel)

Attributes:

uuid: str | None
    Unique identifier for scan result.
value: JsonValue
    Scan value.
answer: str | None
    Answer extracted from model output (optional).
explanation: str | None
    Explanation of result (optional).
metadata: dict[str, Any] | None
    Additional metadata related to the result (optional).
references: list[Reference]
    References to relevant messages or events.
label: str | None
    Label for result to indicate its origin.
type: str | None
    Type to designate contents of 'value' (used in the value_type field in result data frames).
Reference
Reference to scanned content.
class Reference(BaseModel)

Attributes:

type: Literal['message', 'event']
    Reference type.
cite: str | None
    Cite text used when the entity was referenced (optional). For example, a model may have pointed to a message using something like [M22], which is the cite.
id: str
    Reference id (message or event id).
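The cite/id relationship can be sketched in a few lines. The extract_cites helper and id_map shape below are illustrative assumptions, not part of the API:

```python
import re

# Sketch of resolving [M22]-style cites back to message ids.
def extract_cites(text: str, id_map: dict[str, str]) -> list[dict[str, str]]:
    """Return Reference-like dicts for every known [M<n>] cite in text."""
    return [
        {"type": "message", "cite": m.group(0), "id": id_map[m.group(0)]}
        for m in re.finditer(r"\[M\d+\]", text)
        if m.group(0) in id_map
    ]

refs = extract_cites(
    "The refusal occurs in [M3] and again in [M7].",
    {"[M3]": "msg-abc", "[M7]": "msg-def"},
)
```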
Error
Scan error (runtime error which occurred during scan).
class Error(BaseModel)

Attributes:

transcript_id: str
    Target transcript id.
scanner: str
    Scanner name.
error: str
    Error message.
traceback: str
    Error traceback.
refusal: bool
    Whether this error was a refusal.
Loader
Load transcript data.
class Loader(Protocol[TLoaderResult]):
    def __call__(
        self,
        transcript: Transcript,
    ) -> AsyncIterator[TLoaderResult]

transcript: Transcript
    Transcript to yield from.
Scanners
llm_scanner
Create a scanner that uses an LLM to scan transcripts.
This scanner presents a conversation transcript to an LLM along with a custom prompt and answer specification, enabling automated analysis of conversations for specific patterns, behaviors, or outcomes.
Messages are extracted via transcript_messages(), which automatically selects the best strategy (timelines → events → raw messages) and respects context window limits. Each segment is scanned independently with globally unique message numbering ([M1], [M2], …) across all segments.
@scanner(messages="all")
def llm_scanner(
    *,
    question: str | Callable[[Transcript], Awaitable[str]] | None = None,
    answer: AnswerSpec,
    value_to_float: ValueToFloat | None = None,
    template: str | None = None,
    template_variables: dict[str, Any]
    | Callable[[Transcript], dict[str, Any]]
    | None = None,
    preprocessor: MessagesPreprocessor[Transcript] | None = None,
    model: str | Model | None = None,
    model_role: str | None = None,
    retry_refusals: bool | int = 3,
    name: str | None = None,
    content: TranscriptContent | None = None,
    context_window: int | None = None,
    compaction: Literal["all", "last"] | int = "all",
    depth: int | None = None,
    reducer: Callable[[list[Result]], Awaitable[Result]] | None = None,
) -> Scanner[Transcript]

question: str | Callable[[Transcript], Awaitable[str]] | None
    Question for the scanner to answer. Can be a static string (e.g., "Did the assistant refuse the request?") or a function that takes a Transcript and returns a string, for dynamic questions based on transcript content. Can be omitted if you provide a custom template.
answer: AnswerSpec
    Specification of the answer format. Pass "boolean", "numeric", or "string" for a simple answer; pass list[str] for a set of labels; or pass AnswerMultiLabel for multi-classification.
value_to_float: ValueToFloat | None
    Optional function to convert the answer value to a float.
template: str | None
    Overall template for the scanner prompt. The scanner template should include the following variables: {{ question }} (question for the model to answer), {{ messages }} (transcript message history as a string), {{ answer_prompt }} (prompt for a specific type of answer), and {{ answer_format }} (instructions on how to format the answer). In addition, scanner templates can bind to any data within Transcript.metadata (e.g. {{ metadata.score }}).
template_variables: dict[str, Any] | Callable[[Transcript], dict[str, Any]] | None
    Additional variables to make available in the template. Optionally takes a function which receives the current Transcript and returns variables.
preprocessor: MessagesPreprocessor[Transcript] | None
    Transform conversation messages before analysis. Controls exclusion of system messages, reasoning tokens, and tool calls. Defaults to removing system messages. Note: custom transform functions that accept a Transcript are not supported when context_window or timeline scanning produces multiple segments, as the transform receives list[ChatMessage] per segment.
model: str | Model | None
    Optional model specification. Can be a model name string or a Model instance. If None, uses the default model.
model_role: str | None
    Optional model role for role-based model resolution. When set, the model is resolved via get_model(model, role=model_role) at scan time, allowing deferred role resolution when roles are not yet available at scanner construction time.
retry_refusals: bool | int
    Retry model refusals. Pass an int for the number of retries (defaults to 3). Pass False to not retry refusals. If the refusal limit is exceeded, a RuntimeError is raised.
name: str | None
    Scanner name. Use this to assign a name when passing llm_scanner() directly to scan() rather than delegating to it from another scanner.
content: TranscriptContent | None
    Override the transcript content filters for this scanner. For example, TranscriptContent(timeline=True) requests timeline data so the scanner can process span-level segments.
context_window: int | None
    Override the model's context window size for chunking. When set, transcripts exceeding this limit are split into multiple segments, each scanned independently.
compaction: Literal['all', 'last'] | int
    How to handle compaction boundaries when extracting messages from events. "all" (default) scans all compaction segments; "last" scans only the most recent.
depth: int | None
    Maximum depth of the span tree to process when timelines are present. None (default) processes all depths. Ignored for events-only or messages-only transcripts.
reducer: Callable[[list[Result]], Awaitable[Result]] | None
    Custom reducer for aggregating multi-segment results. Accepts any Callable[[list[Result]], Awaitable[Result]]. If None, uses a default reducer based on the answer type (e.g., ResultReducer.any for boolean, ResultReducer.mean for numeric). Standard reducers are available on ResultReducer. Timeline and resultset answers bypass reduction and return a resultset.
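The template variables compose roughly as follows. This is a simplified renderer for illustration only; the library's actual template engine and default template text are assumptions here:

```python
import re

# Simplified sketch of scanner prompt template rendering. The TEMPLATE
# text is illustrative, not the library's default template.
TEMPLATE = (
    "{{ answer_prompt }}\n\n"
    "{{ question }}\n\n"
    "=== Transcript ===\n{{ messages }}\n\n"
    "{{ answer_format }}"
)

def render(template: str, variables: dict[str, str]) -> str:
    """Substitute {{ name }} placeholders from a variables dict."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: variables.get(m.group(1), m.group(0)),
        template,
    )

prompt = render(TEMPLATE, {
    "question": "Did the assistant refuse the request?",
    "messages": "[M1] user: Help me\n[M2] assistant: I can't do that.",
    "answer_prompt": "Answer the question about the conversation below.",
    "answer_format": "Answer with YES or NO.",
})
```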
grep_scanner
Pattern-based transcript scanner.
Scans transcript messages and events for text patterns using grep-style matching. By default, patterns are treated as literal strings (like grep without -E). Set regex=True to treat patterns as regular expressions.
What gets searched depends on what's populated in the transcript:
- Messages are searched if transcript.messages is populated
- Events are searched if transcript.events is populated
- Control population via @scanner(messages=..., events=...)
@scanner(messages="all")
def grep_scanner(
    pattern: str | list[str] | dict[str, str | list[str]],
    *,
    regex: bool = False,
    ignore_case: bool = True,
    word_boundary: bool = False,
    name: str | None = None,
) -> Scanner[Transcript]

pattern: str | list[str] | dict[str, str | list[str]]
    Pattern(s) to search for. Can be:
    - str: single pattern
    - list[str]: multiple patterns (OR logic, single aggregated result)
    - dict[str, str | list[str]]: labeled patterns (returns multiple results, one per label)
regex: bool
    If True, treat patterns as regular expressions. Default False (literal string matching).
ignore_case: bool
    Case-insensitive matching. Default True (like grep -i).
word_boundary: bool
    Match whole words only (adds \b anchors). Default False.
name: str | None
    Scanner name. Use this to assign a name when passing grep_scanner() directly to scan() rather than delegating to it from another scanner.
Examples

Simple pattern (messages only):

    grep_scanner("error")

Multiple patterns (OR logic):

    grep_scanner(["error", "failed", "exception"])

Labeled patterns (separate results):

    grep_scanner({
        "errors": ["error", "failed"],
        "warnings": ["warning", "caution"],
    })

With regex:

    grep_scanner(r"https?://\S+", regex=True)

Search both messages and events:

    @scanner(messages="all", events=["tool", "error"])
    def find_errors() -> Scanner[Transcript]:
        return grep_scanner("error")
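The matching semantics above can be sketched with the standard re module. The grep_match helper is an illustrative stand-in, not the library's implementation:

```python
import re

# Sketch of grep_scanner's matching semantics: literal by default,
# case-insensitive by default, optional whole-word \b anchors.
def grep_match(
    text: str,
    pattern: str,
    *,
    regex: bool = False,
    ignore_case: bool = True,
    word_boundary: bool = False,
) -> bool:
    pat = pattern if regex else re.escape(pattern)  # literal unless regex=True
    if word_boundary:
        pat = rf"\b{pat}\b"                         # whole-word anchors
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pat, text, flags) is not None

hit = grep_match("FATAL ERROR in step 3", "error")               # case-insensitive literal
whole = grep_match("terror alert", "error", word_boundary=True)  # 'terror' does not match
url = grep_match("see https://example.com", r"https?://\S+", regex=True)
```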
Scanner Tools
message_numbering
Create a messages_as_str / extract_references pair with shared numbering.
Returns two functions that share a closure-captured counter and id_map. Each call to the returned messages_as_str auto-increments message IDs globally, so multiple calls produce M1..M5, then M6..M10, etc.
The returned extract_references resolves citations from ANY prior messages_as_str call within this numbering scope.
def message_numbering(
    preprocessor: MessagesPreprocessor[list[ChatMessage]] | None = None,
) -> tuple[MessagesAsStr, ExtractReferences]

preprocessor: MessagesPreprocessor[list[ChatMessage]] | None
    Message preprocessing options applied to every call (e.g., exclude_system, exclude_reasoning). Defaults to excluding system messages when no preprocessor is provided.
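The shared-counter behavior can be sketched as a closure. Messages are simplified to (id, text) tuples here; the real API is async and formats full ChatMessage objects:

```python
import re

# Sketch of the shared-numbering closure: both returned functions
# capture the same counter and id_map.
def message_numbering_sketch():
    counter = 0
    id_map: dict[str, str] = {}

    def messages_as_str(messages: list[tuple[str, str]]) -> str:
        nonlocal counter
        lines = []
        for msg_id, text in messages:
            counter += 1                  # globally unique across calls
            cite = f"[M{counter}]"
            id_map[cite] = msg_id
            lines.append(f"{cite} {text}")
        return "\n".join(lines)

    def extract_references(text: str) -> list[str]:
        # resolves cites from ANY prior messages_as_str call in this scope
        return [id_map[c] for c in re.findall(r"\[M\d+\]", text) if c in id_map]

    return messages_as_str, extract_references

as_str, extract = message_numbering_sketch()
first = as_str([("a", "hi"), ("b", "hello")])  # numbered [M1], [M2]
second = as_str([("c", "more")])               # numbering continues at [M3]
refs = extract("See [M2] and [M3].")
```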
scanner_prompt
Render the default scanner prompt template.
Combines the transcript messages, question, and answer type into a complete prompt using the standard scanner template. Use this when you want the default prompt structure but need control over generation (via generate_answer()) or other scanning steps.
For fully custom prompt templates, use answer_type() to access the .prompt and .format strings directly.
def scanner_prompt(
    messages: str,
    question: str,
    answer: AnswerSpec | Answer,
) -> str

messages: str
    Pre-formatted message string (from messages_as_str or message_numbering).
question: str
    Question for the scanner to answer.
answer: AnswerSpec | Answer
    Answer specification or resolved Answer object.
generate_answer
Generate a model response, optionally parsing it into a Result.
Dispatches to the appropriate generation strategy based on the answer type: tool-based structured generation for AnswerStructured, standard text generation with refusal retry for all other types.
async def generate_answer(
    prompt: str | list[ChatMessage],
    answer: AnswerSpec,
    *,
    model: str | Model | None = None,
    retry_refusals: int = 3,
    parse: bool = True,
    extract_refs: Callable[[str], list[Reference]] | None = None,
    value_to_float: ValueToFloat | None = None,
) -> Result | ModelOutput

prompt: str | list[ChatMessage]
    The scanning prompt (string or message list).
answer: AnswerSpec
    Answer specification ("boolean", "numeric", "string", list[str], or AnswerStructured). Determines both the generation strategy and (when parsing) how to extract the result.
model: str | Model | None
    Model to use for generation. Can be a model name string or a Model instance. If None, uses the default model.
retry_refusals: int
    Number of times to retry on model refusals (stop_reason == "content_filter").
parse: bool
    When True (default), parse the model output into a Result. When False, return the raw ModelOutput.
extract_refs: Callable[[str], list[Reference]] | None
    Function to extract [M1]-style references from the explanation text. Only used when parse=True. Defaults to a no-op that returns no references.
value_to_float: ValueToFloat | None
    Optional function to convert the parsed value to a float. Only used when parse=True.
parse_answer
Parse a model output into a Result using the answer specification.
Delegates to the answer type’s parsing logic. This is a pure function — no LLM call is made.
def parse_answer(
    output: ModelOutput,
    answer: AnswerSpec,
    extract_refs: Callable[[str], list[Reference]],
    value_to_float: ValueToFloat | None = None,
) -> Result

output: ModelOutput
    The model's response to parse.
answer: AnswerSpec
    Answer specification ("boolean", "numeric", "string", list[str], or AnswerStructured).
extract_refs: Callable[[str], list[Reference]]
    Function to extract [M1]-style references from the explanation text.
value_to_float: ValueToFloat | None
    Optional function to convert the parsed value to a float.
ResultReducer
Standard reducers for aggregating multi-segment scan results.
Each static method is an async reducer with signature (list[Result]) -> Result. The llm factory returns a reducer that uses an LLM to synthesize results.
class ResultReducer

Methods:

- mean
    Arithmetic mean of numeric values.

    @staticmethod
    async def mean(results: list[Result]) -> Result

- median
    Median of numeric values.

    @staticmethod
    async def median(results: list[Result]) -> Result

- mode
    Mode (most common) of numeric values.

    @staticmethod
    async def mode(results: list[Result]) -> Result

- max
    Maximum of numeric values.

    @staticmethod
    async def max(results: list[Result]) -> Result

- min
    Minimum of numeric values.

    @staticmethod
    async def min(results: list[Result]) -> Result

- any
    Boolean OR — True if any result is True.

    @staticmethod
    async def any(results: list[Result]) -> Result

- union
    Union of list values, deduplicated, preserving order.

    @staticmethod
    async def union(results: list[Result]) -> Result

- majority
    Most common value, with last-result tiebreaker. Counts occurrences of each value across results. If there is a unique winner it is used; otherwise the last result's value breaks the tie. The answer is taken from the matching result.

    @staticmethod
    async def majority(results: list[Result]) -> Result

- last
    Return the last result with merged auxiliary fields.

    @staticmethod
    async def last(results: list[Result]) -> Result

- llm
    Factory that returns an LLM-based reducer. The returned reducer formats all segment results into a prompt and asks the model to synthesize the best answer.

    @staticmethod
    def llm(
        model: str | Model | None = None,
        prompt: str | None = None,
    ) -> Callable[[list[Result]], Awaitable[Result]]

    model: str | Model | None
        Model to use for synthesis. Defaults to the default model.
    prompt: str | None
        Custom synthesis prompt template. If None, uses a default template.
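The majority tie-breaking logic described above can be sketched over bare values (the real reducer is async and operates on Result models):

```python
from collections import Counter

# Sketch of the majority reducer: unique winner if one exists,
# otherwise the last result's value breaks the tie.
def majority(values: list[str]) -> str:
    counts = Counter(values).most_common()
    top_count = counts[0][1]
    winners = {v for v, n in counts if n == top_count}
    if len(winners) == 1:
        return winners.pop()
    # tie: walk backwards so the last result's value wins
    return next(v for v in reversed(values) if v in winners)

clear = majority(["yes", "no", "yes"])  # unique winner
tied = majority(["yes", "no"])          # tie -> last value
```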
Answer
Protocol for LLM scanner answer types.
@runtime_checkable
class Answer(Protocol)

Attributes:

prompt: str
    Return the answer prompt string.
format: str
    Return the answer format string.

Methods:

- result_for_answer
    Extract and return the result from model output.

    def result_for_answer(
        self,
        output: ModelOutput,
        extract_references: Callable[[str], list[Reference]],
        value_to_float: ValueToFloat | None = None,
    ) -> Result
answer_type
Resolve an answer specification into an Answer object.
The returned object exposes .prompt and .format properties containing the prompt text and format instructions for the answer type.
def answer_type(
    answer: Literal["boolean", "numeric", "string"]
    | list[str]
    | AnswerMultiLabel
    | AnswerStructured,
) -> Answer

answer: Literal['boolean', 'numeric', 'string'] | list[str] | AnswerMultiLabel | AnswerStructured
    Answer specification ("boolean", "numeric", "string", list[str], AnswerMultiLabel, or AnswerStructured).
AnswerSpec
Specification of the answer format for an LLM scanner.
Pass "boolean", "numeric", or "string" for a simple answer; pass list[str] for a set of labels; pass AnswerMultiLabel for multi-classification; or pass AnswerStructured for tool-based structured output.

AnswerSpec: TypeAlias = _TextualAnswerSpec | AnswerStructured

AnswerMultiLabel
Label descriptions for LLM scanner multi-classification.
class AnswerMultiLabel(NamedTuple)

Attributes:

labels: list[str]
    List of label descriptions. Label values (e.g. A, B, C) will be provided automatically.
allow_none: bool
    Allow the model to respond with NONE when no labels apply.
AnswerStructured
Answer with structured output.
Structured answers are objects that conform to a JSON Schema.
class AnswerStructured(NamedTuple)

Attributes:

type: type[BaseModel] | type[list]
    Pydantic BaseModel that defines the type of the answer. Can be a single Pydantic model result or a list of results. See the docs on Structured Answers for more details on defining types, and the docs on Multiple Results for details on prompting scanners to yield multiple results from a scan.
answer_tool: str
    Customize the name of the answer tool provided to the model.
answer_prompt: str
    Template for the prompt that precedes the question posed to the scanner (use the {{ answer_tool }} variable to refer to the name of the answer tool).
answer_format: str
    Template for instructions on answer format placed at the end of the scanner prompt (use the {{ answer_tool }} variable to refer to the name of the answer tool).
max_attempts: int
    Maximum number of times to re-prompt the model to generate the correct schema.
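A structured answer type is an ordinary Pydantic model; its JSON Schema is what constrains the model's output. The field names below are illustrative, not prescribed by the API:

```python
from pydantic import BaseModel

# Hypothetical structured answer type. Pass it via something like
# AnswerStructured(type=Finding), or type=list[Finding] for a scan
# that yields multiple results.
class Finding(BaseModel):
    category: str
    severity: int
    quote: str

# The JSON Schema derived from the model (what the answer tool enforces).
schema = Finding.model_json_schema()
```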
Utils
messages_as_str
Concatenate list of chat messages into a string.
async def messages_as_str(
input: T,
*,
preprocessor: MessagesPreprocessor[T] | None = None,
include_ids: Literal[True] | None = None,
as_json: bool = False,
) -> str | tuple[str, Callable[[str], list[Reference]]]

input: T
    The Transcript with the messages, or a list of messages.
preprocessor: MessagesPreprocessor[T] | None
    Content filter for messages.
include_ids: Literal[True] | None
    If True, prepend ordinal references (e.g., [M1], [M2]) to each message and return a function to extract references from text. If None (default), return a plain formatted string.
as_json: bool
    If True, output a JSON string instead of plain text.
MessageFormatOptions
Message formatting options for controlling message content display.
These options control which parts of messages are included when formatting messages to strings.
@dataclass(frozen=True)
class MessageFormatOptions

Attributes:

exclude_system: bool
    Exclude system messages (defaults to True).
exclude_reasoning: bool
    Exclude reasoning content (defaults to False).
exclude_tool_usage: bool
    Exclude tool usage (defaults to False).
MessagesPreprocessor
ChatMessage preprocessing transformations.
Provide a transform function for fully custom transformations. Use the higher-level options (e.g. exclude_system) to perform various common content removal transformations.
The default MessagesPreprocessor will exclude system messages and do no other transformations.
@dataclass(frozen=True)
class MessagesPreprocessor(MessageFormatOptions, Generic[T])

Attributes:

exclude_system: bool
    Exclude system messages (defaults to True).
exclude_reasoning: bool
    Exclude reasoning content (defaults to False).
exclude_tool_usage: bool
    Exclude tool usage (defaults to False).
transform: Callable[[T], Awaitable[list[ChatMessage]]] | None
    Transform the list of messages.
tool_callers
Build a mapping from tool_call_id to the assistant message that made the call.
This is useful for scanners that need to reference the assistant message that initiated a tool call, rather than the tool message itself.
def tool_callers(
    transcript: Transcript,
) -> dict[str, tuple[ChatMessageAssistant, int]]

transcript: Transcript
    The transcript containing all messages.
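The mapping it builds can be sketched over simplified message dicts (the real function walks ChatMessage objects in a Transcript):

```python
# Sketch of tool_callers' logic: map each tool_call_id to the
# assistant message (and its index) that issued the call.
def tool_callers_sketch(messages: list[dict]) -> dict[str, tuple[dict, int]]:
    callers: dict[str, tuple[dict, int]] = {}
    for index, message in enumerate(messages):
        if message.get("role") == "assistant":
            for call in message.get("tool_calls", []):
                callers[call["id"]] = (message, index)
    return callers

msgs = [
    {"role": "user", "content": "List the files"},
    {"role": "assistant", "tool_calls": [{"id": "tc1", "name": "ls"}]},
    {"role": "tool", "tool_call_id": "tc1", "content": "a.txt"},
]
callers = tool_callers_sketch(msgs)  # "tc1" -> (assistant message, index 1)
```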
Types
MessageType
Message types.
MessageType = Literal["system", "user", "assistant", "tool"]

EventType
Event types.
EventType = Literal[
    "model",
    "tool",
    "compaction",
    "approval",
    "sandbox",
    "info",
    "store",
    "logger",
    "error",
    "span_begin",
    "span_end",
]

TranscriptContent
Content filters for transcript loading.
Specifies which messages, events, and timeline data to include when loading transcript content for scanning.
@dataclass
class TranscriptContent

Attributes:

messages: MessageFilter
    Filter for which message types to include.
events: EventFilter
    Filter for which event types to include.
timeline: TimelineFilter
    Filter for which timeline events to include.
RefusalError
Error indicating that the model refused a scan request.
class RefusalError(RuntimeError)

Registration
scanner
Decorator for registering scanners.
def scanner(
factory: ScannerFactory[P, T] | None = None,
*,
loader: Loader[TScan] | None = None,
messages: list[MessageType] | Literal["all"] | None = None,
events: list[EventType] | Literal["all"] | None = None,
timeline: Literal[True] | list[EventType] | Literal["all"] | None = None,
name: str | None = None,
version: int = 0,
metrics: Sequence[Metric | Mapping[str, Sequence[Metric]]]
| Mapping[str, Sequence[Metric]]
| None = None,
) -> (
ScannerFactory[P, T]
| Callable[[ScannerFactory[P, T]], ScannerFactory[P, T]]
| Callable[[ScannerFactory[P, TScan]], ScannerFactory[P, TScan]]
| Callable[[ScannerFactory[P, TM]], ScannerFactory[P, ScannerInput]]
| Callable[[ScannerFactory[P, TE]], ScannerFactory[P, ScannerInput]]
)

factory: ScannerFactory[P, T] | None
    Decorated scanner function.
loader: Loader[TScan] | None
    Custom data loader for the scanner.
messages: list[MessageType] | Literal['all'] | None
    Message types to scan.
events: list[EventType] | Literal['all'] | None
    Event types to scan.
timeline: Literal[True] | list[EventType] | Literal['all'] | None
    Event types to include in timelines.
name: str | None
    Scanner name (defaults to the function name).
version: int
    Scanner version (defaults to 0).
metrics: Sequence[Metric | Mapping[str, Sequence[Metric]]] | Mapping[str, Sequence[Metric]] | None
    One or more metrics to calculate over the values (only used if the scanner is converted to a scorer via as_scorer()).
loader
Decorator for registering loaders.
def loader(
*,
name: str | None = None,
messages: list[MessageType] | Literal["all"] | None = None,
events: list[EventType] | Literal["all"] | None = None,
timeline: Literal[True] | list[EventType] | Literal["all"] | None = None,
content: TranscriptContent | None = None,
) -> Callable[[LoaderFactory[P, TLoaderResult]], LoaderFactory[P, TLoaderResult]]

name: str | None
    Loader name (defaults to the function name).
messages: list[MessageType] | Literal['all'] | None
    Message types to load from.
events: list[EventType] | Literal['all'] | None
    Event types to load from.
timeline: Literal[True] | list[EventType] | Literal['all'] | None
    Event types to include in timelines.
content: TranscriptContent | None
    Transcript content filter.
as_scorer
Convert a Scanner to an Inspect Scorer.
def as_scorer(
scanner: Scanner[Transcript],
metrics: Sequence[Metric | Mapping[str, Sequence[Metric]]]
| Mapping[str, Sequence[Metric]]
| None = None,
) -> Scorer

scanner: Scanner[Transcript]
    Scanner to convert (must take a Transcript).
metrics: Sequence[Metric | Mapping[str, Sequence[Metric]]] | Mapping[str, Sequence[Metric]] | None
    Metrics for the scorer. Defaults to the metrics specified on the @scanner decorator (or [mean(), stderr()] if none were specified).