Transcript API
Reading
transcripts_from
Read transcripts for scanning.
Transcripts may be stored in a TranscriptsDB or may be Inspect eval logs.
def transcripts_from(location: str | Logs) -> Transcripts

location (str | Logs): Transcripts location. Either a path to a transcript database or path(s) to Inspect eval logs.
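For example, a minimal sketch (the ./logs path is an assumption):

from inspect_scout import transcripts_from

# read transcripts from a directory of Inspect eval logs
transcripts = transcripts_from("./logs")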
Transcript
Transcript info and transcript content (messages and events).
class Transcript(TranscriptInfo)

Attributes

messages (list[ChatMessage]): Main message thread.
events (list[Event]): Events from transcript.
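For example, a sketch that walks the content of an already-read Transcript (using only the attributes documented above):

for message in transcript.messages:
    print(message.role)
for event in transcript.events:
    print(type(event).__name__)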
Transcripts
Collection of transcripts for scanning.
Transcript collections can be filtered using the where(), limit(), and shuffle() methods. The transcripts are not modified in place, so the filtered transcripts should be referenced via the return value. For example:

from inspect_scout import transcripts_from, columns as c

transcripts = transcripts_from("./logs")
transcripts = transcripts.where(c.task_set == "cybench")

class Transcripts(abc.ABC)

Methods
where
Filter the transcript collection by a Condition.

def where(self, condition: Condition) -> "Transcripts"

condition (Condition): Filter condition.
for_validation
Filter transcripts to only those with IDs matching validation cases.

def for_validation(self, validation: ValidationSet | dict[str, ValidationSet]) -> "Transcripts"

validation (ValidationSet | dict[str, ValidationSet]): Validation object containing cases with target IDs.
limit
Limit the number of transcripts processed.

def limit(self, n: int) -> "Transcripts"

n (int): Limit on transcripts.
shuffle
Shuffle the order of transcripts.

def shuffle(self, seed: int | None = None) -> "Transcripts"

seed (int | None): Random seed for shuffling.
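Because each of these methods returns a new collection, calls can be chained. For example (a sketch assuming logs at ./logs):

from inspect_scout import transcripts_from, columns as c

transcripts = (
    transcripts_from("./logs")
    .where(c.task_set == "cybench")
    .shuffle(seed=42)
    .limit(100)
)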
reader
Read the selected transcripts.

@abc.abstractmethod
def reader(self, snapshot: ScanTranscripts | None = None) -> TranscriptsReader

snapshot (ScanTranscripts | None): An optional snapshot which provides hints to make the reader more efficient (e.g. by preventing a full scan to find transcript_id => filename mappings).
from_snapshot
Restore transcripts from a snapshot.

@staticmethod
@abc.abstractmethod
def from_snapshot(snapshot: ScanTranscripts) -> "Transcripts"

snapshot (ScanTranscripts)
TranscriptsReader
Read transcripts based on a TranscriptsQuery.
class TranscriptsReader(abc.ABC)

Methods

index
Index of TranscriptInfo for the collection.

@abc.abstractmethod
def index(self) -> AsyncIterator[TranscriptInfo]

read
Read transcript content.

@abc.abstractmethod
async def read(self, transcript: TranscriptInfo, content: TranscriptContent) -> Transcript

transcript (TranscriptInfo): Transcript to read.
content (TranscriptContent): Content to read (e.g. specific message types, etc.).
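For example, a sketch of the read loop (the import path and the transcript_id attribute on TranscriptInfo are assumptions; content is a TranscriptContent value describing what to read):

from inspect_scout import Transcripts, TranscriptContent  # import path is an assumption

async def scan(transcripts: Transcripts, content: TranscriptContent) -> None:
    reader = transcripts.reader()
    # index() yields TranscriptInfo; read() loads the requested content
    async for info in reader.index():
        transcript = await reader.read(info, content)
        print(info.transcript_id, len(transcript.messages))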
Database
transcripts_db
Read/write interface to transcripts database.
def transcripts_db(location: str) -> TranscriptsDB

location (str): Database location (e.g. directory or S3 bucket).
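For example (a sketch; the local ./transcripts directory is an assumption):

from inspect_scout import transcripts_db

db = transcripts_db("./transcripts")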
TranscriptsDB
Database of transcripts.
class TranscriptsDB(abc.ABC)

Methods

__init__
Create a transcripts database.

def __init__(self, location: str, where: list[Condition] | None = None) -> None

location (str): Database location (e.g. local or S3 file path).
where (list[Condition] | None): Optional list of conditions used to filter transcripts.
connect
Connect to transcripts database.

@abc.abstractmethod
async def connect(self) -> None

disconnect
Disconnect from transcripts database.

@abc.abstractmethod
async def disconnect(self) -> None

insert
Insert transcripts into database.

@abc.abstractmethod
async def insert(
    self,
    transcripts: Iterable[Transcript] | AsyncIterable[Transcript] | Transcripts | TranscriptsSource | pa.RecordBatchReader,
) -> None

transcripts (Iterable[Transcript] | AsyncIterable[Transcript] | Transcripts | TranscriptsSource | pa.RecordBatchReader): Transcripts to insert (iterable, async iterable, or source).
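For example, a sketch that imports Inspect eval logs into a database (paths are assumptions; insert() accepts a Transcripts collection directly):

from inspect_scout import transcripts_db, transcripts_from

async def import_logs() -> None:
    db = transcripts_db("./transcripts")
    await db.connect()
    try:
        await db.insert(transcripts_from("./logs"))
    finally:
        await db.disconnect()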
transcript_ids
Get transcript IDs matching conditions.
Optimized method that returns only transcript IDs without loading full metadata. The default implementation uses select(), but subclasses can override for better performance.

@abc.abstractmethod
async def transcript_ids(
    self,
    where: list[Condition] | None = None,
    limit: int | None = None,
    shuffle: bool | int = False,
) -> dict[str, str | None]

where (list[Condition] | None): Condition(s) to filter by.
limit (int | None): Maximum number to return.
shuffle (bool | int): Randomly shuffle results (pass an int for a reproducible seed).
select
Select transcripts matching a condition.

@abc.abstractmethod
def select(
    self,
    where: list[Condition] | None = None,
    limit: int | None = None,
    shuffle: bool | int = False,
) -> AsyncIterator[TranscriptInfo]

where (list[Condition] | None): Condition(s) to select for.
limit (int | None): Maximum number to select.
shuffle (bool | int): Randomly shuffle transcripts selected (pass an int for a reproducible seed).
read
Read transcript content.

@abc.abstractmethod
async def read(self, t: TranscriptInfo, content: TranscriptContent) -> Transcript

t (TranscriptInfo): Transcript to read.
content (TranscriptContent): Content to read (messages, events, etc.).
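For example, a sketch that selects and then reads matching transcripts (import path is an assumption; assumes a connected TranscriptsDB and a TranscriptContent value):

from inspect_scout import TranscriptsDB, TranscriptContent, columns as c  # import path is an assumption

async def read_matching(db: TranscriptsDB, content: TranscriptContent) -> None:
    # select() yields TranscriptInfo; read() loads the requested content
    async for info in db.select(where=[c.model == "gpt-4"], limit=10):
        transcript = await db.read(info, content)
        print(len(transcript.messages))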
TranscriptsSource
Async iterator of transcripts.
class TranscriptsSource(Protocol):
    def __call__(self) -> AsyncIterator[Transcript]
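Any callable returning an async iterator of Transcript satisfies this protocol. For example, a sketch of a source over an in-memory list (my_transcripts is a hypothetical list of Transcript objects; import path is an assumption):

from typing import AsyncIterator
from inspect_scout import Transcript  # import path is an assumption

async def my_source() -> AsyncIterator[Transcript]:
    # yield transcripts from any backing store
    for transcript in my_transcripts:
        yield transcript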
Filtering

Column
Database column with comparison operators.
Supports various predicate functions including like(), not_like(), between(), etc. Additionally supports standard Python equality and comparison operators (e.g. ==, >, etc.).

class Column

Methods

in_
Check if value is in a list.

def in_(self, values: list[Any]) -> Condition

values (list[Any])
not_in
Check if value is not in a list.

def not_in(self, values: list[Any]) -> Condition

values (list[Any])
like
SQL LIKE pattern matching (case-sensitive).

def like(self, pattern: str) -> Condition

pattern (str)
not_like
SQL NOT LIKE pattern matching (case-sensitive).

def not_like(self, pattern: str) -> Condition

pattern (str)
ilike
PostgreSQL ILIKE pattern matching (case-insensitive).
Note: For SQLite and DuckDB, this will use LIKE with LOWER() for case-insensitivity.

def ilike(self, pattern: str) -> Condition

pattern (str)
not_ilike
PostgreSQL NOT ILIKE pattern matching (case-insensitive).
Note: For SQLite and DuckDB, this will use NOT LIKE with LOWER() for case-insensitivity.

def not_ilike(self, pattern: str) -> Condition

pattern (str)
is_null
Check if value is NULL.

def is_null(self) -> Condition

is_not_null
Check if value is not NULL.

def is_not_null(self) -> Condition

between
Check if value is between two values.

def between(self, low: Any, high: Any) -> Condition

low (Any): Lower bound (inclusive). If None, raises ValueError.
high (Any): Upper bound (inclusive). If None, raises ValueError.
not_between
Check if value is not between two values.

def not_between(self, low: Any, high: Any) -> Condition

low (Any): Lower bound (inclusive). If None, raises ValueError.
high (Any): Upper bound (inclusive). If None, raises ValueError.
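For example, a sketch combining several predicates (column names are from the Columns attributes documented below):

from inspect_scout import columns as c

filter = c.model.like("gpt-%") & c.score.between(0.5, 1.0)
filter = c.task_set.in_(["cybench", "math"]) & c.error.is_null()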
Condition
WHERE clause condition that can be combined with others.
class Condition(BaseModel)

Attributes

left (Union[str, 'Condition', None]): Column name (simple) or left operand (compound).
operator (Union[Operator, LogicalOperator, None]): Comparison operator (simple) or logical operator (compound).
right (Union['Condition', list[ScalarValue], tuple[ScalarValue, ScalarValue], ScalarValue]): Comparison value (simple) or right operand (compound).
is_compound (bool): True for AND/OR/NOT conditions, False for simple comparisons.
params (list[ScalarValue]): SQL parameters extracted from the condition for parameterized queries.
Methods

to_sql
Generate SQL WHERE clause and parameters.

def to_sql(
    self,
    dialect: Union[SQLDialect, Literal["sqlite", "duckdb", "postgres"]] = SQLDialect.SQLITE,
) -> tuple[str, list[Any]]

dialect (Union[SQLDialect, Literal['sqlite', 'duckdb', 'postgres']]): Target SQL dialect (sqlite, duckdb, or postgres).
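For example, a sketch that renders a condition to parameterized SQL (the exact SQL text produced is an assumption):

from inspect_scout import columns as c

condition = (c.model == "gpt-4") & (c.score > 0.5)
sql, params = condition.to_sql(dialect="duckdb")
# sql is a WHERE-clause fragment with placeholders; params holds the bound values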
Columns
Entry point for building filter expressions.
Supports both dot notation and bracket notation for accessing columns:
from inspect_scout import columns as c
c.column_name
c["column_name"]
c["nested.json.path"]class ColumnsAttributes
transcript_id (Column): Globally unique identifier for transcript.
source_type (Column): Type of transcript source (e.g. "eval_log", "weave", etc.).
source_id (Column): Globally unique identifier of transcript source (e.g. eval_id in Inspect logs).
source_uri (Column): URI for source data (e.g. full path to the Inspect log file or weave op).
date (Column): Date transcript was created.
task_set (Column): Set from which transcript task was drawn (e.g. benchmark name).
task_id (Column): Identifier for task (e.g. dataset sample id).
task_repeat (Column): Repeat for a given task id within a task set (e.g. epoch).
agent (Column): Agent name.
agent_args (Column): Agent args.
model (Column): Model used for eval.
model_options (Column): Generation options for model.
score (Column): Headline score value.
success (Column): Reduction of score to True/False success.
total_time (Column): Total execution time.
error (Column): Error that halted execution.
limit (Column): Limit that halted execution.
columns
Column selector for where expressions.
Typically aliased to a more compact expression (e.g. c) for use in queries. For example:
from inspect_scout import columns as c
filter = c.model == "gpt-4"
filter = (c.task_set == "math") & (c.epochs > 1)

columns = Columns()

LogColumns
Typed column interface for Inspect log transcripts.
Provides typed properties for standard Inspect log columns while preserving the ability to access custom fields through the base Columns class methods.
class LogColumns(Columns)

Attributes

sample_id (Column): Unique id for sample.
eval_id (Column): Globally unique id for eval.
eval_status (Column): Status of eval.
log (Column): Location that the log file was read from.
eval_tags (Column): Tags associated with evaluation run.
eval_metadata (Column): Additional eval metadata.
task_args (Column): Task arguments.
generate_config (Column): Generate config specified for model instance.
model_roles (Column): Model roles.
id (Column): Unique id for sample.
epoch (Column): Epoch number for sample.
input (Column): Sample input.
target (Column): Sample target.
sample_metadata (Column): Sample metadata.
working_time (Column): Time spent working (model generation, sandbox calls, etc.).
log_columns
Log columns selector for where expressions.
Typically aliased to a more compact expression (e.g. c) for use in queries. For example:
from inspect_scout import log_columns as c
# typed access to standard fields
filter = c.model == "gpt-4"
filter = (c.task_set == "math") & (c.epochs > 1)
# dynamic access to custom fields
filter = c["custom_field"] > 100log_columns = LogColumns()