Grep Scanner
Overview
grep_scanner() provides pattern-based scanning of transcripts using grep-style matching. Unlike the LLM Scanner, which uses a language model for analysis, grep_scanner() performs fast, deterministic text pattern matching.
By default the grep scanner does simple literal string matching; an optional regex mode handles more complex patterns. You can also pass multiple patterns, which are combined with OR logic, or labeled patterns, which produce categorized results.
Basic Usage
Here is a simple example that finds occurrences of “error” in assistant messages:
from inspect_scout import Scanner, Transcript, grep_scanner, scanner
@scanner(messages=["assistant"])
def find_errors() -> Scanner[Transcript]:
    return grep_scanner("error")

The scanner returns a Result with:
- value: Count of matches found (integer)
- explanation: Context snippets showing each match
- references: Message references for match locations
Pattern Types
Single Pattern
The simplest form takes a single string pattern:
grep_scanner("error")

Multiple Patterns
Pass a list to match any of multiple patterns (OR logic):
grep_scanner(["error", "failed", "exception"])

All matches across all patterns are aggregated into a single count.
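To make the aggregation concrete, here is a minimal standard-library sketch of OR composition. It is not the grep_scanner() implementation: the helper name `count_any` is hypothetical, and whether the library joins patterns into one expression or sums per-pattern counts is an assumption (the two only differ when patterns overlap).

```python
from __future__ import annotations

import re


def count_any(text: str, patterns: list[str]) -> int:
    """Count matches of any of the given literal patterns, ignoring case."""
    if not patterns:
        return 0
    # Escape each pattern so it is matched literally, then join with "|" (OR).
    combined = re.compile("|".join(re.escape(p) for p in patterns), re.IGNORECASE)
    return len(combined.findall(text))


text = "Error: request failed. Retrying after exception; second attempt failed."
print(count_any(text, ["error", "failed", "exception"]))  # prints 4
```

The single integer mirrors the value field described above: one "Error", two "failed", and one "exception" add up to a count of 4.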
Labeled Patterns
Pass a dict to get separate results for each category:
@scanner(messages="all")
def categorized_issues() -> Scanner[Transcript]:
    return grep_scanner({
        "errors": ["error", "failed", "exception"],
        "warnings": ["warning", "caution"],
        "security": ["password", "secret", "token"],
    })

This returns a list[Result], one per label:
# Returns:
# - Result(label="errors", value=3, ...)
# - Result(label="warnings", value=1, ...)
# - Result(label="security", value=0, ...)

Options
Case Sensitivity
By default, matching is case-insensitive (like grep -i):
grep_scanner("error")  # Matches "error", "ERROR", "Error"

For case-sensitive matching:
grep_scanner("error", ignore_case=False)  # Only matches "error"

Regex Mode
By default, patterns are treated as literal strings:
grep_scanner("file.*txt")  # Matches literal "file.*txt"

Enable regex mode for regular expression patterns:
grep_scanner(r"https?://\S+", regex=True) # Matches URLs
grep_scanner(r"\d{3}-\d{4}", regex=True)  # Matches phone patterns

Word Boundary
Match whole words only:
grep_scanner("error", word_boundary=True)
# Matches "error" but not "errorCode" or "myerror"

This adds \b word boundary anchors around the pattern.
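The three options can be pictured as steps in building one regular expression. The sketch below uses only Python's `re` module and is an illustration of the documented behavior, not the library's code; `compile_pattern` is a hypothetical helper whose keyword names simply mirror the grep_scanner() options above.

```python
from __future__ import annotations

import re


def compile_pattern(
    pattern: str,
    *,
    regex: bool = False,
    ignore_case: bool = True,
    word_boundary: bool = False,
) -> re.Pattern[str]:
    """Build a compiled pattern from grep-style options (illustrative only)."""
    # Literal mode: escape metacharacters so "file.*txt" matches itself.
    body = pattern if regex else re.escape(pattern)
    # Whole-word mode: wrap in \b anchors; the group keeps alternations intact.
    if word_boundary:
        body = rf"\b(?:{body})\b"
    flags = re.IGNORECASE if ignore_case else 0
    return re.compile(body, flags)


literal = compile_pattern("file.*txt")
print(literal.search("see file.*txt") is not None)   # literal text matches
print(literal.search("file123txt") is not None)      # wildcard is not interpreted

whole_word = compile_pattern("error", word_boundary=True)
print(whole_word.search("an ERROR occurred") is not None)
print(whole_word.search("errorCode") is not None)
```

Escaping first and anchoring second means a literal pattern stays literal even when whole-word matching is enabled.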
Scanner Results
Result Structure
For single/list patterns:
| Field | Type | Description |
|---|---|---|
| value | int | Count of matches found |
| explanation | str \| None | Context snippets for each match |
| references | list[Reference] | Message/event citations |
For labeled patterns (dict), each label produces its own Result with the same structure, plus:
| Field | Type | Description |
|---|---|---|
| label | str | The category label |
Context Snippets
The explanation field shows context around each match:
[M1]: ...request returned an **error** code 500...
[M2]: ...operation **error**: connection timeout...
[E1]: TOOL (search_files): Result: Found **error** in file.py...
- Up to 50 characters before/after each match
- Match text highlighted with **bold**
- Ellipsis (...) when context is truncated
- Reference prefix ([M1], [M2] for messages, [E1], [E2] for events)
Searching Events
To search events (tool calls, errors, etc.), use the events parameter in the @scanner decorator:
Events Only
@scanner(events=["tool", "error"])
def find_tool_errors() -> Scanner[Transcript]:
    return grep_scanner("timeout")

Both Messages and Events
@scanner(messages="all", events=["tool", "error"])
def find_all_errors() -> Scanner[Transcript]:
    return grep_scanner("error")

The scanner automatically searches whatever is populated in the transcript:
- Messages are searched if transcript.messages is populated
- Events are searched if transcript.events is populated
References use [M1], [M2] for messages and [E1], [E2] for events.
Supported Event Types
| Event Type | What’s Searched |
|---|---|
| model | Model output completion text |
| tool | Function name, arguments, and result |
| error | Error message |
| info | Data field (JSON stringified if object) |
| logger | Log message text |
| approval | Message, tool call, and decision |
Examples
Finding Profanity
@scanner(messages=["assistant"])
def profanity_check() -> Scanner[Transcript]:
    return grep_scanner(
        ["damn", "hell", "crap"],
        word_boundary=True,  # Avoid partial matches
    )

Detecting URLs
@scanner(messages="all")
def url_detection() -> Scanner[Transcript]:
    return grep_scanner(
        r"https?://[^\s<>\"']+",
        regex=True,
    )

Multi-Category Analysis
@scanner(messages=["assistant"])
def content_analysis() -> Scanner[Transcript]:
    return grep_scanner({
        "code_snippets": [r"```", "def ", "class ", "function"],
        "questions": [r"\?$"],
        "commands": ["please", "could you", "can you"],
    }, regex=True)

Searching Tool Events
@scanner(events=["tool"])
def tool_failures() -> Scanner[Transcript]:
    return grep_scanner(
        ["timeout", "connection refused", "permission denied"],
        ignore_case=True,  # Already the default; shown for clarity
    )

Messages and Events
@scanner(messages="all", events=["tool", "error"])
def comprehensive_error_check() -> Scanner[Transcript]:
    return grep_scanner({
        "errors": ["error", "exception", "failed"],
        "warnings": ["warning", "deprecated"],
    })
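
For intuition about what a labeled scan like this produces, here is a small standard-library sketch that counts matches per label. It is a rough model, not the library's implementation: `count_by_label` is a hypothetical helper, and it returns plain integers rather than Result objects.

```python
from __future__ import annotations

import re


def count_by_label(text: str, labeled: dict[str, list[str]]) -> dict[str, int]:
    """Count case-insensitive literal matches separately for each label."""
    counts: dict[str, int] = {}
    for label, patterns in labeled.items():
        if not patterns:
            counts[label] = 0
            continue
        # Patterns under one label are OR-ed together, as in the list form.
        combined = re.compile("|".join(re.escape(p) for p in patterns), re.IGNORECASE)
        counts[label] = len(combined.findall(text))
    return counts


text = "Warning: the call failed with an error. Deprecated API raised an exception."
print(count_by_label(text, {
    "errors": ["error", "exception", "failed"],
    "warnings": ["warning", "deprecated"],
}))  # prints {'errors': 3, 'warnings': 2}
```

A label with no matches still appears with a count of 0, which lines up with the value=0 entry shown in the Labeled Patterns section.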