Scanners

Overview

Scanners are the main unit of processing in Inspect Scout and can target a wide variety of content types. In this article we’ll cover the basic scanning concepts, and then drill into creating scanners that target various types (Transcript, ChatMessage, or Event) as well as creating custom loaders which enable scanning of lists of events or messages.

This article goes in depth on custom scanner development. If you are looking for a straightforward high-level way to create an LLM-based scanner see the LLM Scanner documentation.

Note that you can also use scanners directly as Inspect scorers (see Scanners as Scorers for details).

Scanner Basics

A Scanner is a function that takes a ScannerInput (typically a Transcript, but possibly an Event, ChatMessage, or list of events or messages) and returns a Result.

The result includes a value which can be of any type—this might be True to indicate that something was found but might equally be a number to indicate a count. More elaborate scanner values (dict or list) are also possible.

Here is a simple scanner that uses a model to look for agent “confusion”—whether or not it finds confusion, it still returns the model completion as an explanation:

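import re

from inspect_ai.model import get_model
from inspect_scout import (
    Result, Scanner, Transcript, messages_as_str, scanner
)

@scanner(messages="all")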
def confusion() -> Scanner[Transcript]:
    
    async def scan(transcript: Transcript) -> Result:

        # call model
        output = await get_model().generate(
            "Here is a transcript of an LLM agent " +
            "solving a puzzle:\n\n" +
            "===================================" +
            await messages_as_str(transcript) +
            "===================================\n\n" +
            "In the transcript above do you see the " +
            "agent becoming confused? Respond " +
            "beginning with 'Yes' or 'No', followed " +
            "by an explanation."
        )

        # extract the first word
        match = re.match(r"^\w+", output.completion.strip())

        # return result
        if match:
            answer = match.group(0)
            return Result(
                value=answer.lower() == "yes",
                answer=answer,
                explanation=output.completion,
            )
        else:
            return Result(value=False, explanation=output.completion)

    return scan
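
The answer-parsing step is easy to get subtly wrong, so it can help to try it in isolation. The following is a minimal sketch using only the standard library; parse_yes_no is a hypothetical helper name (not part of Inspect Scout) that applies the same first-word rule as the scanner above:

```python
import re
from typing import Optional


def parse_yes_no(completion: str) -> tuple[bool, Optional[str]]:
    """Return (value, answer) using the same first-word rule as the scanner."""
    match = re.match(r"^\w+", completion.strip())
    if match is None:
        # no leading word (e.g. empty output or leading punctuation)
        return False, None
    answer = match.group(0)
    return answer.lower() == "yes", answer


print(parse_yes_no("Yes, the agent repeats the same failed move."))  # (True, 'Yes')
print(parse_yes_no("  no. The agent stays on track."))               # (False, 'no')
print(parse_yes_no("**Yes** it does"))                               # (False, None)
```

The last call shows a limitation of this rule: a completion that opens with markup rather than a word is treated as "not found", which is why the scanner still returns the full completion as the explanation.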

This scanner illustrates some of the lower-level mechanics of building custom scanners. You can also use the higher level llm_scanner() to implement this in far fewer lines of code:

from inspect_scout import Scanner, Transcript, llm_scanner, scanner

@scanner(messages="all")
def confusion() -> Scanner[Transcript]:
    return llm_scanner(
        question="In the transcript above do you see " +
            "the agent becoming confused?",
        answer="boolean"
    )

Input Types

Transcript is the most common ScannerInput; however, several other types are possible:

  • Event — Single event from the transcript (e.g. ModelEvent, ToolEvent, etc.).

  • ChatMessage — Single chat message from the transcript message history.

  • list[Event] or list[ChatMessage] — Arbitrary sets of events or messages extracted from the Transcript (see Loaders below for details).

See the sections on Transcripts, Events, Messages, and Loaders below for additional details on handling various input types.

Input Filtering

One important principle of the Inspect Scout transcript pipeline is that only the precise data to be scanned should be read, and nothing more. This can dramatically improve performance as messages and events that won’t be seen by scanners are never deserialized. Scanner input filters are specified as arguments to the @scanner decorator (you may have noticed the messages="all" attached to the scanner decorator in the example above).

For example, here we look for instances of assistants swearing. For this task we only need to look at assistant messages, so we specify messages=["assistant"]:

@scanner(messages=["assistant"])
def assistant_swearing() -> Scanner[Transcript]:

    async def scan(transcript: Transcript) -> Result:
        swear_words = [
            word 
            for m in transcript.messages 
            for word in extract_swear_words(m.text)
        ]
        return Result(
            value=len(swear_words),
            explanation=",".join(swear_words)
        )

    return scan

With this filter, only assistant messages (and no events at all) will be loaded from transcripts during scanning.
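
The extract_swear_words() helper used above is not defined in this article. As a rough sketch of what such a helper could look like (the word list is a placeholder and the function is hypothetical, not part of Inspect Scout):

```python
import re

# Placeholder vocabulary for illustration only; substitute a real word list.
SWEAR_WORDS = frozenset({"darn", "heck", "dang"})


def extract_swear_words(text: str) -> list[str]:
    """Return each flagged word in `text`, lowercased, in order of appearance."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word in SWEAR_WORDS]


print(extract_swear_words("Well, DARN it. Heck if I know!"))  # ['darn', 'heck']
```

Returning every occurrence (rather than a set) keeps len(swear_words) meaningful as a count, which is what the scanner reports as its value.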

Note that by default, no filters are active, so if you don’t specify values for messages and/or events your scanner will not be called!

Transcripts

Transcripts are the most common input to scanners. If you are reading from Inspect eval logs, each log will have samples * epochs transcripts. If you are reading from another source, each agent trace will yield a single Transcript.

Transcript Fields

Here are the available Transcript fields:

Field          Type                  Description
transcript_id  str                   Globally unique identifier for a transcript (maps to EvalSample.uuid in Inspect logs).
source_type    str                   Type of transcript source (e.g. “eval_log”, “weave”, etc.).
source_id      str                   Globally unique identifier for a transcript source (maps to eval_id in Inspect logs).
source_uri     str                   URI for source data (e.g. full path to the Inspect log file).
date           iso                   Date/time when the transcript was created.
task_set       str                   Set from which transcript task was drawn (e.g. Inspect task name or benchmark name).
task_id        str                   Identifier for task (e.g. dataset sample id).
task_repeat    int                   Repeat for a given task id within a task set (e.g. epoch).
agent          str                   Agent used to execute task.
agent_args     dict (JSON)           Arguments passed to create agent.
model          str                   Main model used by agent.
model_options  dict (JSON)           Generation options for main model.
score          JsonValue (JSON)      Value indicating score on task.
success        bool                  Boolean reduction of score to succeeded/failed.
total_time     number                Time required to execute task (seconds).
total_tokens   number                Tokens spent in execution of task.
error          str                   Error message that terminated the task.
limit          str                   Limit that caused the task to exit (e.g. “tokens”, “messages”, etc.).
metadata       dict[str, JsonValue]  Transcript source specific metadata (e.g. model, task name, errors, epoch, dataset sample id, limits, etc.).
messages       list[ChatMessage]     Message history.
events         list[Event]           Event history (e.g. model events, tool events, etc.).

Content Filtering

Note that the messages and events fields will not be populated unless you specify a messages or events filter on your scanner. For example, this scanner will see all messages and events:

@scanner(messages="all", events="all")
def my_scanner() -> Scanner[Transcript]: ...

This scanner will see only model and tool events:

@scanner(events=["model", "tool"])
def my_scanner() -> Scanner[Transcript]: ...

This scanner will see only assistant messages:

@scanner(messages=["assistant"])
def my_scanner() -> Scanner[Transcript]: ...

Presenting Messages

When processing transcripts, you will often want to present an entire message history to a model for analysis. Above, we used the messages_as_str() function to do this:

# call model
result = await get_model().generate(
    "Here is a transcript of an LLM agent " +
    "solving a puzzle:\n\n" +
    "===================================" +
    await messages_as_str(transcript) +
    "===================================\n\n" +
    "In the transcript above do you see the agent " +
    "becoming confused? Respond beginning with 'Yes' " +
    "or 'No', followed by an explanation."
)

The messages_as_str() function takes a Transcript | list[ChatMessage] and will by default remove system messages from the message list. See MessagesPreprocessor for other available options.

Multiple Results

Scanners can return multiple results as a list. For example:

return [
    Result(label="deception", value=True, explanation="..."),
    Result(label="misconfiguration", value=True, explanation="...")
]

This is useful when a scanner is capable of making several types of observation. In this case it’s also important to indicate the origin of the result (i.e. which class of observation it is), which you can do using the label field. Note that a label can repeat within a set, so you could, for example, have multiple results with label="deception".

When a list is returned, each individual result will yield its own row in the results data frame.

When validating scanners that return lists of results, you can use result set validation to specify expected values for each label independently.

Event Scanners

To write a scanner that targets events, write a function that takes the event type(s) you want to process. For example, this scanner will see only model events:

@scanner
def my_scanner() -> Scanner[ModelEvent]:
    async def scan(event: ModelEvent) -> Result:
        ...

    return scan

Note that the events="model" filter was not required since we had already declared our scanner to take only model events. If we wanted to take both model and tool events we’d do this:

@scanner
def my_scanner() -> Scanner[ModelEvent | ToolEvent]:
    async def scan(event: ModelEvent | ToolEvent) -> Result:
        ...

    return scan

Message Scanners

To write a scanner that targets messages, write a function that takes the message type(s) you want to process. For example, this scanner will only see tool messages:

@scanner
def my_scanner() -> Scanner[ChatMessageTool]:
    async def scan(message: ChatMessageTool) -> Result:
        ...

    return scan

This scanner will see only user and assistant messages:

@scanner
def my_scanner() -> Scanner[ChatMessageUser | ChatMessageAssistant]:
    async def scan(message: ChatMessageUser | ChatMessageAssistant) -> Result:
        ...

    return scan

Scanner Metrics

You can add metrics to scanners to aggregate result values. Metrics are computed during scanning and available as part of the scan results. For example:

from inspect_ai.scorer import mean
from inspect_scout import Scanner, Transcript, llm_scanner, scanner

@scanner(messages="all", metrics=[mean()])
def efficiency() -> Scanner[Transcript]:
    return llm_scanner(
        question="On a scale of 1 to 10, how efficiently did the assistant perform?",
        answer="numeric",
    )

Note that we import the mean metric from inspect_ai. You can use any standard Inspect metric or create custom metrics, and can optionally include more than one metric (e.g. stderr).

See the Inspect documentation on Built in Metrics and Custom Metrics for additional details.

Result Sets

If your scanner yields multiple results you can still use it as a scorer, but you will want to provide a dictionary of metrics corresponding to the labels used by your results. For example, if you have a scanner that can yield results with label="deception" or label="misconfiguration", you might declare your metrics like this:

@scanner(messages="all", metrics=[{ "deception": [mean(), stderr()], "misconfiguration": [mean(), stderr()] }])
def my_scanner() -> Scanner[Transcript]: ...

Or you can use a glob (*) to use the same metrics for all labels:

@scanner(messages="all", metrics=[{ "*": [mean(), stderr()] }])
def my_scanner() -> Scanner[Transcript]: ...

You should also be sure to return a result for each supported label (so that metrics can be computed correctly on each row).
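
To see why a missing label can distort a metric, here is a plain-Python sketch. It is not Scout's implementation: it assumes a metric is simply averaged over whatever results carry a given label, and mean_by_label is a hypothetical helper written for this illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_by_label(results: list[tuple[str, bool]]) -> dict[str, float]:
    """Average the values for each label across all result rows."""
    values: dict[str, list[float]] = defaultdict(list)
    for label, value in results:
        values[label].append(float(value))
    return {label: mean(vals) for label, vals in values.items()}


# Four transcripts, with deception observed in only the first one.
# (a) a result is emitted only when something is found
sparse = [("deception", True)]
# (b) a result is emitted for the label on every transcript
complete = [
    ("deception", True),
    ("deception", False),
    ("deception", False),
    ("deception", False),
]

print(mean_by_label(sparse))    # {'deception': 1.0}
print(mean_by_label(complete))  # {'deception': 0.25}
```

Under that assumption, only case (b) gives a rate that reflects all four transcripts.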

Scanners as Scorers

You may have noticed that scanners are very similar to Inspect Scorers. This is by design, and it is actually possible to use scanners directly as Inspect scorers.

For example, for the confusion() scanner we implemented above:

@scanner(messages="all")
def confusion() -> Scanner[Transcript]:
    
    async def scan(transcript: Transcript) -> Result:

        # model call elided for brevity
        output = await get_model().generate(...)

        # extract the first word
        match = re.match(r"^\w+", output.completion.strip())

        # return result
        if match:
            answer = match.group(0)
            return Result(
                value=answer.lower() == "yes",
                answer=answer,
                explanation=output.completion,
            )
        else:
            return Result(value=False, explanation=output.completion)

    return scan

We can use this directly in an Inspect Task as follows:

from inspect_ai import Task, task

from .scanners import confusion

@task
def mytask():
    return Task(
        ...,
        scorer = confusion()
    )

We can also use it with the inspect score command:

inspect score --scorer scanners.py@confusion logfile.eval

Metrics

The metrics used for the scorer will default to mean() and stderr()—however, you can also explicitly specify metrics on the @scanner decorator:

@scanner(messages="all", metrics=[mean(), bootstrap_stderr()])
def confusion() -> Scanner[Transcript]: ...

If you are interfacing with code that expects only Scorer instances, you can also use the as_scorer() function from Inspect Scout to explicitly convert your scanner to a scorer:

from inspect_ai import eval
from inspect_scout import as_scorer

from .mytasks import ctf_task
from .scanners import confusion

eval(ctf_task(scorer=as_scorer(confusion())))

If your scanner yields multiple results see the discussion above on Result Sets for details on how to specify metrics for this case.

Custom Loaders

Sometimes the items you want to scan from a Transcript don’t map neatly onto single messages or events. For example, you might want to process pairs of user/assistant messages. To do this, create a custom Loader that yields the content as required.

For example, here is a Loader that yields user/assistant message pairs:

@loader(messages=["user", "assistant"])
def conversation_turns():
    async def load(
        transcript: Transcript
    ) -> AsyncIterator[list[ChatMessage]]:

        for user, assistant in message_pairs(transcript.messages):
            yield [user, assistant]

    return load

Note that, just as with scanners, the loader must specify the messages=["user", "assistant"] filter in order to see those messages.
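
The message_pairs() helper is not defined in this article. One possible sketch is shown below. It assumes only that each message exposes a role attribute (as ChatMessage does), and the Msg class is a stand-in used purely to exercise the helper:

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple, TypeVar


@dataclass
class Msg:
    """Stand-in for ChatMessage, used only to exercise the helper."""
    role: str
    text: str


M = TypeVar("M")


def message_pairs(messages: Iterable[M]) -> Iterator[Tuple[M, M]]:
    """Pair the most recent user message with the next assistant message."""
    pending_user = None
    for message in messages:
        if message.role == "user":
            # remember the latest user message until a reply arrives
            pending_user = message
        elif message.role == "assistant" and pending_user is not None:
            yield pending_user, message
            pending_user = None


history = [
    Msg("user", "Hi"),
    Msg("assistant", "Hello!"),
    Msg("assistant", "Anything else?"),
    Msg("user", "Bye"),
]
print([(u.text, a.text) for u, a in message_pairs(history)])
# [('Hi', 'Hello!')]
```

In this sketch, assistant messages with no preceding unanswered user message, and trailing user messages with no reply, are skipped rather than paired.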

We can now use this loader in a scanner that looks for refusals:

@scanner(loader=conversation_turns())
def assistant_refusals() -> Scanner[list[ChatMessage]]:

    async def scan(messages: list[ChatMessage]) -> Result:
        user, assistant = messages
        return Result(
            value=is_refusal(assistant.text),
            explanation=await messages_as_str(messages)
        )

    return scan
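
The is_refusal() helper is likewise left undefined here. A minimal keyword-based sketch might look like the following; the phrase list is a placeholder, the function is hypothetical, and a production scanner would more likely delegate this judgment to a model:

```python
# Placeholder phrases for illustration; real refusal detection is usually fuzzier.
REFUSAL_MARKERS = (
    "i can't help with",
    "i cannot help with",
    "i'm unable to",
    "i won't be able to",
)


def is_refusal(text: str) -> bool:
    """Return True if `text` contains any known refusal phrase (case-insensitive)."""
    # normalize curly apostrophes so "can’t" matches "can't"
    normalized = text.lower().replace("\u2019", "'")
    return any(marker in normalized for marker in REFUSAL_MARKERS)


print(is_refusal("I’m unable to assist with that request."))   # True
print(is_refusal("Sure, here is the summary you asked for."))  # False
```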