Database Schema

Overview

In a transcript database, the only strictly required field is transcript_id (although you’ll almost always also want to include a messages field, as that’s the main thing targeted by most scanners).

Further, there are many standard fields (e.g. task, agent, model, score) which you’ll want to populate if you have access to them, as this will provide important context both when viewing transcripts and when viewing scan results. You can also include source_* fields as a reference to where the transcript originated. Finally, arbitrary other fields can be included. All fields are queryable using the Transcripts API.

| Field | Type | Description |
|---|---|---|
| transcript_id | string | Required. A globally unique identifier for a transcript. |
| source_type | string | Optional. Type of transcript source (e.g. “weave”, “logfire”, “eval_log”, etc.). Useful for providing a hint to readers about what might be available in the metadata field. |
| source_id | string | Optional. Globally unique identifier for a transcript source (e.g. a project id). |
| source_uri | string | Optional. URI for source data (e.g. link to a web page or REST resource for discovering more about the transcript). |
| date | string | Optional. ISO 8601 datetime of transcript creation. |
| task_set | string | Optional. Set from which transcript task was drawn (e.g. Inspect task name or benchmark name). |
| task_id | string | Optional. Identifier for task (e.g. dataset sample id). |
| task_repeat | int | Optional. Repeat for a given task id within a task set (e.g. epoch). |
| agent | string | Optional. Agent used to execute task. |
| agent_args | dict[str,Any] (JSON) | Optional. Arguments passed to create agent. |
| model | string | Optional. Main model used by agent. |
| model_options | dict[str,Any] (JSON) | Optional. Generation options for main model. |
| score | JsonValue (JSON) | Optional. Value indicating score on task. |
| success | bool | Optional. Boolean reduction of score to succeeded/failed. |
| total_time | float | Optional. Time (in seconds) required to execute task. |
| total_tokens | int | Optional. Tokens spent in execution of task. |
| error | string | Optional. Error message that terminated the task. |
| limit | string | Optional. Limit that caused the task to exit (e.g. “tokens”, “messages”, etc.). |
| messages | list[ChatMessage] (JSON) | Optional. List of ChatMessage with message history. |
| events | list[Event] (JSON) | Optional. List of Event with event history (e.g. model events, tool events, etc.). |

Field types marked with JSON are stored in the database as serialized JSON strings and then converted to richer types when accessed via the Transcript interface.
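
To make the JSON handling concrete, the sketch below contrasts the stored form of a JSON-marked field with the richer type you get back when reading. The row contents are made up for illustration, and real inserts would normally go through one of the import methods described under Importing Data below.

```python
import json

# Sketch of a row as stored in the database: JSON-marked columns such as
# messages hold serialized JSON strings.
stored_row = {
    "transcript_id": "t-0001",  # required; all other fields are optional
    "model": "gpt-4o",
    "messages": json.dumps([
        {"role": "user", "content": "What is 2 + 2?"},
        {"role": "assistant", "content": "4"},
    ]),
}

# When the same record is read back through the Transcript interface, the
# messages column is surfaced as list[ChatMessage] rather than a raw string:
#
#   transcript.messages[0].role     # "user"
#   transcript.messages[1].content  # "4"
```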

Metadata

You can include arbitrary other fields in your database, which will be made available as Transcript.metadata. These fields can then be used for filtering in calls to Transcripts.where().

Note that metadata columns are forwarded into the results database for scans (transcript_metadata), so it is generally good practice not to include large amounts of data in these columns.
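
For example, if your transcripts carry a custom difficulty column, a filter over it might look roughly like the sketch below. The difficulty field and the filter expression shown are illustrative assumptions; consult the Transcripts API reference for the exact forms that Transcripts.where() accepts.

```python
# Sketch: filter transcripts by a custom metadata column. The "difficulty"
# column and the filter expression are assumptions for illustration; check
# the Transcripts API reference for the supported filter syntax.
transcripts = ...  # a Transcripts handle obtained via the Transcripts API

hard_transcripts = transcripts.where("difficulty = 'hard'")
```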

Messages

The messages field is a JSON-encoded string of list[ChatMessage]. There are several helper functions available within the inspect_ai package to assist in converting the raw message formats of various providers to the Inspect ChatMessage format:

| Provider API | Functions |
|---|---|
| OpenAI Chat Completions | messages_from_openai(), model_output_from_openai() |
| OpenAI Responses | messages_from_openai_responses(), model_output_from_openai_responses() |
| Anthropic Messages | messages_from_anthropic(), model_output_from_anthropic() |
| Google Generate Content | messages_from_google(), model_output_from_google() |
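
For example, converting captured OpenAI Chat Completions messages might look something like the sketch below. The import path and the exact argument each helper expects are assumptions; check the inspect_ai documentation for the real signatures.

```python
# Sketch: convert raw OpenAI Chat Completions messages to Inspect types.
# The module path and argument shape are assumptions -- consult the
# inspect_ai docs for the actual location and signature of these helpers.
from inspect_ai.model import messages_from_openai  # assumed import path

raw_messages = [  # provider-format messages captured from your system
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "4"},
]

messages = messages_from_openai(raw_messages)  # -> list[ChatMessage]
```

The resulting list[ChatMessage] can then be serialized to JSON and stored in the messages field.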

For many straightforward transcripts the list of messages will be all that is needed for analysis.

Events

The events field is a JSON-encoded string of list[Event]. Note that if your scanners deal entirely in messages rather than events (as a great many do), then it is not necessary to provide events.

Events are typically important when you are either analyzing complex multi-agent transcripts or doing very granular scanning for specific phenomena (e.g. tool calling errors).

While you can include any of the event types defined in inspect_ai.event, there is a subset that is both likely to be of interest and that maps onto data provided by observability platforms and/or OTEL traces. These include:

| Event | Description |
|---|---|
| ModelEvent | Generation call to a model. |
| ToolEvent | Tool call made by a model. |
| ErrorEvent | Runtime error aborting the transcript. |
| SpanBeginEvent | Mark the beginning of a transcript span (e.g. agent execution, tool call, custom block, etc.). |
| SpanEndEvent | Mark the end of a transcript span. |

Most observability systems will have some equivalent of the above in their traces. When reconstructing model events, you will likely also want to use the helper functions mentioned above under Messages for converting raw model API payloads to ChatMessage.
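
As a rough sketch, reconstructing a model event from a captured Chat Completions call might look like the following. The ModelEvent field names (model, input, output) and the import paths shown are assumptions based on the event types listed above; verify them against inspect_ai.event before relying on this.

```python
# Sketch: rebuild a ModelEvent from a captured OpenAI Chat Completions call.
# Import paths and ModelEvent field names are assumptions -- verify against
# the inspect_ai.event and inspect_ai.model documentation.
from inspect_ai.event import ModelEvent                  # assumed module
from inspect_ai.model import (                           # assumed module
    messages_from_openai,
    model_output_from_openai,
)

raw_request = {  # request payload captured from your trace data
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is 2 + 2?"}],
}
raw_response = {  # corresponding (abbreviated) response payload
    "model": "gpt-4o",
    "choices": [
        {"message": {"role": "assistant", "content": "4"}, "finish_reason": "stop"}
    ],
}

event = ModelEvent(                                      # assumed field names
    model=raw_request["model"],
    input=messages_from_openai(raw_request["messages"]),
    output=model_output_from_openai(raw_response),
)
```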

Note: Not Required

The events field is only important if you have scanners that will be doing event analysis. Note that the default llm_scanner() provided within Scout looks only at messages, not events.

Importing Data

Now that you understand the schema and have an idea of how you want to map your data into it, use one of the following methods to create the database:

  1. Transcript API: Read and parse transcripts into Transcript objects and use the TranscriptsDB.insert() function to add them to the database (see the sketch following this list).

  2. Arrow Import: Read an existing set of transcripts stored in Arrow/Parquet and pass them to TranscriptsDB.insert() as a PyArrow RecordBatchReader.

  3. Parquet Data Lake: Point the TranscriptsDB at an existing data lake (ensuring that the records adhere to the transcript database schema).

  4. Inspect Logs: Import Inspect AI eval logs from a log directory.
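
As a starting point, a minimal version of the first method might look like the sketch below. The import paths, the way a TranscriptsDB is opened, and the Transcript constructor arguments are assumptions; only TranscriptsDB.insert() and the field names come from the schema described above.

```python
# Sketch of method 1: parse raw records into Transcript objects and insert
# them. Import paths and constructor arguments are assumptions -- consult
# the Scout API documentation for the actual interfaces.
from inspect_scout import Transcript, TranscriptsDB  # assumed import path
from inspect_ai.model import messages_from_openai     # assumed import path

def import_records(db: TranscriptsDB, raw_records: list[dict]) -> None:
    transcripts = [
        Transcript(
            transcript_id=rec["id"],                         # required field
            model=rec.get("model"),                          # standard fields as available
            messages=messages_from_openai(rec["messages"]),  # raw OpenAI-format messages
        )
        for rec in raw_records
    ]
    db.insert(transcripts)  # TranscriptsDB.insert() as referenced in method 1
```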