Transcript API

Reading

transcripts_from

Read transcripts for scanning.

Transcripts may be stored in a TranscriptsDB or may be Inspect eval logs.

def transcripts_from(location: str | Logs) -> Transcripts

location str | Logs: Transcripts location. Either a path to a transcript database or path(s) to Inspect eval logs.

Transcript

Transcript info and transcript content (messages and events).

class Transcript(TranscriptInfo)

Attributes

messages list[ChatMessage]: Main message thread.
events list[Event]: Events from transcript.

Transcripts

Collection of transcripts for scanning.

Transcript collections can be filtered using the where(), limit(), and shuffle() methods. These methods do not modify the collection in place, so reference the filtered transcripts via the return value. For example:

from inspect_scout import transcripts_from, columns as c

transcripts = transcripts_from("./logs")
transcripts = transcripts.where(c.task_set == "cybench")
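
The non-mutating behaviour described above can be illustrated with a small stand-in. This is a minimal sketch, not the library's implementation: ToyTranscripts and its predicate-based where() are hypothetical names invented for the example, and the real where() takes a Condition rather than a callable.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class ToyTranscripts:
    """Hypothetical stand-in for a transcript collection (not the real class)."""

    ids: Tuple[str, ...]

    def where(self, predicate: Callable[[str], bool]) -> "ToyTranscripts":
        # Build a new collection; self is left untouched.
        return ToyTranscripts(tuple(i for i in self.ids if predicate(i)))

    def limit(self, n: int) -> "ToyTranscripts":
        # Keep only the first n ids in a new collection.
        return ToyTranscripts(self.ids[:n])

    def shuffle(self, seed: Optional[int] = None) -> "ToyTranscripts":
        # A private Random instance keeps the global RNG state unchanged.
        shuffled = list(self.ids)
        random.Random(seed).shuffle(shuffled)
        return ToyTranscripts(tuple(shuffled))


everything = ToyTranscripts(("a1", "b2", "a3", "b4"))
subset = everything.where(lambda i: i.startswith("a")).limit(1)
```

Because each call returns a new object, discarding the return value silently discards the filter, which is why the example above reassigns the result.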

class Transcripts(abc.ABC)

Methods

where

Filter the transcript collection by a Condition.

def where(self, condition: Condition) -> "Transcripts"

condition Condition: Filter condition.

for_validation

Filter transcripts to only those with IDs matching validation cases.

def for_validation(
    self, validation: ValidationSet | dict[str, ValidationSet]
) -> "Transcripts"

validation ValidationSet | dict[str, ValidationSet]: Validation object containing cases with target IDs.

limit

Limit the number of transcripts processed.

def limit(self, n: int) -> "Transcripts"

n int: Limit on transcripts.

shuffle

Shuffle the order of transcripts.

def shuffle(self, seed: int | None = None) -> "Transcripts"

seed int | None: Random seed for shuffling.

reader

Read the selected transcripts.

@abc.abstractmethod
def reader(self, snapshot: ScanTranscripts | None = None) -> TranscriptsReader

snapshot ScanTranscripts | None: An optional snapshot that provides hints to make the reader more efficient (e.g. by avoiding a full scan to find transcript_id => filename mappings).

from_snapshot

Restore transcripts from a snapshot.

@staticmethod
@abc.abstractmethod
def from_snapshot(snapshot: ScanTranscripts) -> "Transcripts"

snapshot ScanTranscripts

TranscriptsReader

Read transcripts based on a TranscriptsQuery.

class TranscriptsReader(abc.ABC)

Methods

index

Index of TranscriptInfo for the collection.

@abc.abstractmethod
def index(self) -> AsyncIterator[TranscriptInfo]

read

Read transcript content.

@abc.abstractmethod
async def read(
    self, transcript: TranscriptInfo, content: TranscriptContent
) -> Transcript

transcript TranscriptInfo: Transcript to read.
content TranscriptContent: Content to read (e.g. specific message types, etc.)
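
The split between index() (cheap metadata) and read() (full content) can be sketched with an in-memory stand-in. ToyInfo, ToyTranscript, and InMemoryReader are hypothetical names used only for illustration, and the toy read() omits the content argument that the real method takes.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, Dict, List


@dataclass
class ToyInfo:
    """Hypothetical stand-in for TranscriptInfo."""

    transcript_id: str


@dataclass
class ToyTranscript(ToyInfo):
    """Hypothetical stand-in for Transcript."""

    messages: List[str] = field(default_factory=list)


class InMemoryReader:
    """Toy reader: index() yields lightweight info, read() loads content."""

    def __init__(self, store: Dict[str, List[str]]) -> None:
        self._store = store

    async def index(self) -> AsyncIterator[ToyInfo]:
        # An async generator: calling it returns an async iterator.
        for transcript_id in self._store:
            yield ToyInfo(transcript_id)

    async def read(self, info: ToyInfo) -> ToyTranscript:
        # Content is loaded only for the transcript that was asked for.
        return ToyTranscript(info.transcript_id, list(self._store[info.transcript_id]))


async def read_all(reader: InMemoryReader) -> List[ToyTranscript]:
    # Walk the index, then read each transcript in turn.
    return [await reader.read(info) async for info in reader.index()]


loaded = asyncio.run(read_all(InMemoryReader({"t1": ["hi"], "t2": ["hello", "bye"]})))
```

Keeping the index separate from content lets a caller filter or count transcripts without paying the cost of loading messages and events.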

Database

transcripts_db

Read/write interface to transcripts database.

def transcripts_db(location: str) -> TranscriptsDB

location str: Database location (e.g. directory or S3 bucket).

TranscriptsDB

Database of transcripts.

class TranscriptsDB(abc.ABC)

Methods

__init__

Create a transcripts database.

def __init__(self, location: str, where: list[Condition] | None = None) -> None

location str: Database location (e.g. local or S3 file path)
where list[Condition] | None: Optional list of conditions used to filter transcripts.

connect

Connect to transcripts database.

@abc.abstractmethod
async def connect(self) -> None

disconnect

Disconnect from transcripts database.

@abc.abstractmethod
async def disconnect(self) -> None
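
Since connect() and disconnect() are separate calls, it helps to pair them so that the connection is released even when work in between fails. The sketch below shows one way to do that with a try/finally wrapper. ToyDB and connected() are hypothetical; whether TranscriptsDB itself supports `async with` is not assumed here.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, List


class ToyDB:
    """Hypothetical stand-in exposing only connect() and disconnect()."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    async def connect(self) -> None:
        self.calls.append("connect")

    async def disconnect(self) -> None:
        self.calls.append("disconnect")


@asynccontextmanager
async def connected(db: ToyDB) -> AsyncIterator[ToyDB]:
    await db.connect()
    try:
        yield db
    finally:
        # Runs even if the body raises, so the connection is always released.
        await db.disconnect()


async def main() -> List[str]:
    db = ToyDB()
    try:
        async with connected(db):
            raise RuntimeError("simulated failure while connected")
    except RuntimeError:
        pass
    return db.calls


calls = asyncio.run(main())
```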

insert

Insert transcripts into database.

@abc.abstractmethod
async def insert(
    self,
    transcripts: Iterable[Transcript]
    | AsyncIterable[Transcript]
    | Transcripts
    | TranscriptsSource
    | pa.RecordBatchReader,
) -> None

transcripts Iterable[Transcript] | AsyncIterable[Transcript] | Transcripts | TranscriptsSource | pa.RecordBatchReader: Transcripts to insert (iterable, async iterable, or source).
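
A method that accepts both Iterable and AsyncIterable inputs typically normalizes them into a single async stream first. The helper below is a hedged sketch of that idea. as_async_iter is a hypothetical name, and nothing here reflects how insert() is actually implemented.

```python
import asyncio
from collections.abc import AsyncIterable as AsyncIterableABC
from typing import AsyncIterable, AsyncIterator, Iterable, List, TypeVar, Union

T = TypeVar("T")


async def as_async_iter(items: Union[Iterable[T], AsyncIterable[T]]) -> AsyncIterator[T]:
    """Yield from either a sync or an async iterable (hypothetical helper)."""
    if isinstance(items, AsyncIterableABC):
        async for item in items:
            yield item
    else:
        for item in items:
            yield item


async def numbers() -> AsyncIterator[int]:
    # Example async source.
    for n in (3, 4):
        yield n


async def collect() -> List[int]:
    out: List[int] = []
    async for n in as_async_iter([1, 2]):  # plain list
        out.append(n)
    async for n in as_async_iter(numbers()):  # async generator
        out.append(n)
    return out


result = asyncio.run(collect())
```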

transcript_ids

Get transcript IDs matching conditions.

Optimized method that returns only transcript IDs without loading full metadata. Default implementation uses select(), but subclasses can override for better performance.

@abc.abstractmethod
async def transcript_ids(
    self,
    where: list[Condition] | None = None,
    limit: int | None = None,
    shuffle: bool | int = False,
) -> dict[str, str | None]

where list[Condition] | None: Condition(s) to filter by.
limit int | None: Maximum number to return.
shuffle bool | int: Randomly shuffle results (pass int for reproducible seed).
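
The `bool | int` type of shuffle has a subtlety worth spelling out: in Python, bool is a subclass of int, so True and False must be distinguished by identity before the value is treated as a seed. The function below is a hypothetical interpretation of such a parameter, not the library's code.

```python
import random
from typing import List, Union


def apply_shuffle(ids: List[str], shuffle: Union[bool, int] = False) -> List[str]:
    """Hypothetical interpretation of a `bool | int` shuffle argument."""
    if shuffle is False:
        # No shuffling requested: keep the original order.
        return list(ids)
    # True means "shuffle with no fixed seed"; any int (including 0) is a seed.
    seed = None if shuffle is True else shuffle
    out = list(ids)
    random.Random(seed).shuffle(out)
    return out


ids = ["t1", "t2", "t3", "t4", "t5"]
unshuffled = apply_shuffle(ids)
seeded_a = apply_shuffle(ids, 42)
seeded_b = apply_shuffle(ids, 42)
```

Checking `shuffle is False` rather than `not shuffle` keeps a seed of 0 usable, since 0 is falsy but is not the object False.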

select

Select transcripts matching a condition.

@abc.abstractmethod
def select(
    self,
    where: list[Condition] | None = None,
    limit: int | None = None,
    shuffle: bool | int = False,
) -> AsyncIterator[TranscriptInfo]

where list[Condition] | None: Condition(s) to select for.
limit int | None: Maximum number to select.
shuffle bool | int: Randomly shuffle transcripts selected (pass int for reproducible seed).

read

Read transcript content.

@abc.abstractmethod
async def read(self, t: TranscriptInfo, content: TranscriptContent) -> Transcript

t TranscriptInfo: Transcript to read.
content TranscriptContent: Content to read (messages, events, etc.)

TranscriptsSource

Async iterator of transcripts.

class TranscriptsSource(Protocol):
    def __call__(self) -> AsyncIterator[Transcript]
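
Any zero-argument callable that returns an async iterator satisfies a protocol of this shape, and an async generator function is the simplest such callable. In the sketch below, ToyTranscript and ToySource are hypothetical stand-ins for the real types.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Protocol


@dataclass
class ToyTranscript:
    """Hypothetical stand-in for Transcript."""

    transcript_id: str


class ToySource(Protocol):
    """Same shape as the protocol above, using the toy transcript type."""

    def __call__(self) -> AsyncIterator[ToyTranscript]: ...


async def two_transcripts() -> AsyncIterator[ToyTranscript]:
    # Calling this function (without awaiting) returns an async iterator.
    yield ToyTranscript("t1")
    yield ToyTranscript("t2")


async def drain(source: ToySource) -> List[str]:
    # The source is called once, then iterated with `async for`.
    return [t.transcript_id async for t in source()]


ids = asyncio.run(drain(two_transcripts))
```

Note that the function itself is passed as the source, not the result of calling it.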

Filtering

Column

Database column with comparison operators.

Supports various predicate functions including like(), not_like(), between(), etc. Also supports standard Python equality and comparison operators (e.g. ==, >, etc.).

class Column

Methods

in_

Check if value is in a list.

def in_(self, values: list[Any]) -> Condition

values list[Any]

not_in

Check if value is not in a list.

def not_in(self, values: list[Any]) -> Condition

values list[Any]
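
List membership checks map naturally onto a parameterized SQL IN clause with one placeholder per value. The sketch below shows that general technique against an in-memory SQLite table. in_clause is a hypothetical helper, and the SQL that in_() and not_in() actually produce may differ.

```python
import sqlite3
from typing import Any, List, Tuple


def in_clause(column: str, values: List[Any], negate: bool = False) -> Tuple[str, List[Any]]:
    """Render `column [NOT] IN (?, ?, ...)` with one placeholder per value."""
    placeholders = ", ".join("?" for _ in values)
    keyword = "NOT IN" if negate else "IN"
    return f'"{column}" {keyword} ({placeholders})', list(values)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (model TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [("a",), ("b",), ("c",)])

# Values travel as parameters, never interpolated into the SQL text.
sql, params = in_clause("model", ["a", "c"])
rows_in = [r[0] for r in conn.execute(f"SELECT model FROM t WHERE {sql} ORDER BY model", params)]

not_sql, not_params = in_clause("model", ["a", "c"], negate=True)
rows_not_in = [r[0] for r in conn.execute(f"SELECT model FROM t WHERE {not_sql}", not_params)]
```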

like

SQL LIKE pattern matching (case-sensitive).

def like(self, pattern: str) -> Condition

pattern str

not_like

SQL NOT LIKE pattern matching (case-sensitive).

def not_like(self, pattern: str) -> Condition

pattern str

ilike

PostgreSQL ILIKE pattern matching (case-insensitive).

Note: For SQLite and DuckDB, this will use LIKE with LOWER() for case-insensitivity.

def ilike(self, pattern: str) -> Condition

pattern str

not_ilike

PostgreSQL NOT ILIKE pattern matching (case-insensitive).

Note: For SQLite and DuckDB, this will use NOT LIKE with LOWER() for case-insensitivity.

def not_ilike(self, pattern: str) -> Condition

pattern str
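
For readers unfamiliar with LIKE patterns: `%` matches any run of characters (including none) and `_` matches exactly one character. The pure-Python sketch below approximates those semantics and mimics the LOWER() approach noted above for the case-insensitive variants. like_match is a hypothetical helper for illustration only and ignores escape characters and database-specific collation rules.

```python
import re


def like_match(value: str, pattern: str, case_insensitive: bool = False) -> bool:
    """Approximate SQL LIKE: % matches any run of characters, _ exactly one."""
    if case_insensitive:
        # Mirrors LOWER(column) LIKE LOWER(pattern).
        value, pattern = value.lower(), pattern.lower()
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), value, re.DOTALL) is not None


prefix_hit = like_match("gpt-4o", "gpt-%")
case_miss = like_match("GPT-4o", "gpt-%")
case_hit = like_match("GPT-4o", "gpt-%", case_insensitive=True)
single_char = like_match("gpt-4", "gpt-_")
```

The not_like() and not_ilike() variants are simply the negation of the corresponding match.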

is_null

Check if value is NULL.

def is_null(self) -> Condition

is_not_null

Check if value is not NULL.

def is_not_null(self) -> Condition

between

Check if value is between two values.

def between(self, low: Any, high: Any) -> Condition

low Any: Lower bound (inclusive). If None, raises ValueError.
high Any: Upper bound (inclusive). If None, raises ValueError.

not_between

Check if value is not between two values.

def not_between(self, low: Any, high: Any) -> Condition

low Any: Lower bound (inclusive). If None, raises ValueError.
high Any: Upper bound (inclusive). If None, raises ValueError.
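
The documented bounds behaviour (both ends inclusive, None rejected with ValueError) can be stated compactly in plain Python. The functions below are a hypothetical model of those semantics, operating on values directly rather than building a Condition.

```python
from typing import Any


def between(value: Any, low: Any, high: Any) -> bool:
    """Inclusive range check that rejects missing bounds."""
    if low is None or high is None:
        raise ValueError("between() requires both a low and a high bound")
    return low <= value <= high


def not_between(value: Any, low: Any, high: Any) -> bool:
    # Logical negation of the inclusive check, with the same bound validation.
    return not between(value, low, high)


at_lower_bound = between(1, 1, 3)
at_upper_bound = between(3, 1, 3)
outside = between(4, 1, 3)

try:
    between(5, None, 10)
    raised = False
except ValueError:
    raised = True
```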

Condition

WHERE clause condition that can be combined with others.

class Condition(BaseModel)

Attributes

left Union[str, 'Condition', None]: Column name (simple) or left operand (compound).
operator Union[Operator, LogicalOperator, None]: Comparison operator (simple) or logical operator (compound).
right Union['Condition', list[ScalarValue], tuple[ScalarValue, ScalarValue], ScalarValue]: Comparison value (simple) or right operand (compound).
is_compound bool: True for AND/OR/NOT conditions, False for simple comparisons.
params list[ScalarValue]: SQL parameters extracted from the condition for parameterized queries.

Methods

to_sql

Generate SQL WHERE clause and parameters.

def to_sql(
    self,
    dialect: Union[
        SQLDialect, Literal["sqlite", "duckdb", "postgres"]
    ] = SQLDialect.SQLITE,
) -> tuple[str, list[Any]]

dialect Union[SQLDialect, Literal['sqlite', 'duckdb', 'postgres']]: Target SQL dialect (sqlite, duckdb, or postgres).
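
The `(sql, params)` return shape is the standard form for parameterized queries: the clause carries placeholders and the values travel separately. The toy classes below sketch how simple and compound conditions might compile into that shape. Simple and Compound are hypothetical names, and the exact SQL text the real to_sql() emits (quoting, placeholder style per dialect) is not assumed.

```python
import sqlite3
from dataclasses import dataclass
from typing import Any, List, Tuple, Union


@dataclass
class Simple:
    """Toy simple condition: column, comparison operator, value."""

    column: str
    operator: str  # e.g. "=", ">", "<"
    value: Any

    def to_sql(self) -> Tuple[str, List[Any]]:
        return f'"{self.column}" {self.operator} ?', [self.value]


@dataclass
class Compound:
    """Toy compound condition joining two operands with AND or OR."""

    left: Union["Simple", "Compound"]
    operator: str  # "AND" or "OR"
    right: Union["Simple", "Compound"]

    def to_sql(self) -> Tuple[str, List[Any]]:
        lsql, lparams = self.left.to_sql()
        rsql, rparams = self.right.to_sql()
        # Parameters are concatenated in the same order as the placeholders.
        return f"({lsql} {self.operator} {rsql})", lparams + rparams


cond = Compound(Simple("task_set", "=", "math"), "AND", Simple("score", ">", 0.5))
sql, params = cond.to_sql()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcripts (task_set TEXT, score REAL)")
conn.executemany(
    "INSERT INTO transcripts VALUES (?, ?)",
    [("math", 0.9), ("math", 0.1), ("code", 0.8)],
)
rows = conn.execute(f"SELECT task_set, score FROM transcripts WHERE {sql}", params).fetchall()
```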

Columns

Entry point for building filter expressions.

Supports both dot notation and bracket notation for accessing columns:

from inspect_scout import columns as c

c.column_name
c["column_name"]
c["nested.json.path"]

class Columns

Attributes

transcript_id Column: Globally unique identifier for transcript.
source_type Column: Type of transcript source (e.g. “eval_log”, “weave”, etc.).
source_id Column: Globally unique identifier of transcript source (e.g. ‘eval_id’ in Inspect logs).
source_uri Column: URI for source data (e.g. full path to the Inspect log file or weave op).
date Column: Date transcript was created.
task_set Column: Set from which transcript task was drawn (e.g. benchmark name).
task_id Column: Identifier for task (e.g. dataset sample id).
task_repeat Column: Repeat for a given task id within a task set (e.g. epoch).
agent Column: Agent name.
agent_args Column: Agent args.
model Column: Model used for eval.
model_options Column: Generation options for model.
score Column: Headline score value.
success Column: Reduction of ‘score’ to True/False success.
total_time Column: Total execution time.
error Column: Error that halted execution.
limit Column: Limit that halted execution.

columns

Column selector for where expressions.

Typically aliased to a more compact expression (e.g. c) for use in queries. For example:

from inspect_scout import columns as c
filter = c.model == "gpt-4"
filter = (c.task_set == "math") & (c.epochs > 1)

columns = Columns()

LogColumns

Typed column interface for Inspect log transcripts.

Provides typed properties for standard Inspect log columns while preserving the ability to access custom fields through the base Metadata class methods.

class LogColumns(Columns)

Attributes

sample_id Column: Unique id for sample.
eval_id Column: Globally unique id for eval.
eval_status Column: Status of eval.
log Column: Location that the log file was read from.
eval_tags Column: Tags associated with evaluation run.
eval_metadata Column: Additional eval metadata.
task_args Column: Task arguments.
generate_config Column: Generate config specified for model instance.
model_roles Column: Model roles.
id Column: Unique id for sample.
epoch Column: Epoch number for sample.
input Column: Sample input.
target Column: Sample target.
sample_metadata Column: Sample metadata.
working_time Column: Time spent working (model generation, sandbox calls, etc.).

log_columns

Log columns selector for where expressions.

Typically aliased to a more compact expression (e.g. c) for use in queries. For example:

from inspect_scout import log_columns as c

# typed access to standard fields
filter = c.model == "gpt-4"
filter = (c.task_set == "math") & (c.epochs > 1)

# dynamic access to custom fields
filter = c["custom_field"] > 100

log_columns = LogColumns()