# Database Schema

## Overview
In a transcript database, the only strictly required field is `transcript_id` (although you'll almost always also want to include a `messages` field, as that is the main thing targeted by most scanners).

Further, there are many standard fields (e.g. `task`, `agent`, `model`, `score`) which you'll want to populate if you have access to them, as this provides important context both when viewing transcripts and when viewing scan results. You can also include `source_*` fields as a reference to where the transcript originated. Finally, arbitrary other fields can be included. All fields are queryable using the Transcripts API.
| Field | Type | Description |
|---|---|---|
| `transcript_id` | string | Required. A globally unique identifier for a transcript. |
| `source_type` | string | Optional. Type of transcript source (e.g. "weave", "logfire", "eval_log", etc.). Useful for providing a hint to readers about what might be available in the metadata field. |
| `source_id` | string | Optional. Globally unique identifier for a transcript source (e.g. a project id). |
| `source_uri` | string | Optional. URI for source data (e.g. link to a web page or REST resource for discovering more about the transcript). |
| `date` | string | Optional. ISO 8601 datetime of transcript creation. |
| `task_set` | string | Optional. Set from which the transcript task was drawn (e.g. Inspect task name or benchmark name). |
| `task_id` | string | Optional. Identifier for the task (e.g. dataset sample id). |
| `task_repeat` | int | Optional. Repeat for a given task id within a task set (e.g. epoch). |
| `agent` | string | Optional. Agent used to execute the task. |
| `agent_args` | dict[str,Any] (JSON) | Optional. Arguments passed to create the agent. |
| `model` | string | Optional. Main model used by the agent. |
| `model_options` | dict[str,Any] (JSON) | Optional. Generation options for the main model. |
| `score` | JsonValue (JSON) | Optional. Value indicating score on the task. |
| `success` | bool | Optional. Boolean reduction of score to succeeded/failed. |
| `total_time` | float | Optional. Time (in seconds) required to execute the task. |
| `total_tokens` | int | Optional. Tokens spent in execution of the task. |
| `error` | string | Optional. Error message that terminated the task. |
| `limit` | string | Optional. Limit that caused the task to exit (e.g. "tokens", "messages", etc.). |
| `messages` | list[ChatMessage] (JSON) | Optional. List of ChatMessage with message history. |
| `events` | list[Event] (JSON) | Optional. List of Event with event history (e.g. model events, tool events, etc.). |
Field types marked with JSON are stored in the database as serialized JSON strings and then converted to richer types when accessed via the Transcript interface.
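For example, a single record might look like the following (a sketch with made-up values, shown as a Python dict; fields marked JSON above are stored as serialized strings):

```python
import json

# Illustrative transcript record conforming to the schema above.
# Only transcript_id is required; all other fields are optional,
# and the values shown here are made up.
record = {
    "transcript_id": "a2f8c1e0-7d3b-4c9a-9e61-0f4b2d8c5a11",
    "source_type": "eval_log",
    "date": "2025-01-15T09:30:00+00:00",
    "task_set": "swe_bench",
    "task_id": "sample-42",
    "task_repeat": 0,
    "agent": "react",
    "model": "openai/gpt-4o",
    "score": json.dumps(0.75),    # JSON field: serialized JsonValue
    "success": True,
    "total_time": 182.4,
    "total_tokens": 51234,
    "messages": json.dumps([      # JSON field: serialized list[ChatMessage]
        # ChatMessage fields abbreviated for the sketch
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "Paris."},
    ]),
}
```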
## Metadata
You can include arbitrary other fields in your database, which will be made available as `Transcript.metadata`. These fields can then be used for filtering in calls to `Transcripts.where()`.

Note that metadata columns are forwarded into the results database for scans (`transcript_metadata`), so it is generally good practice not to include large amounts of data in these columns.
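As a small illustration (the `experiment` and `git_commit` column names below are made up, and the filter syntax accepted by `Transcripts.where()` is not shown here; consult the Transcripts API for the exact usage):

```python
# Illustrative record with two custom columns alongside standard fields.
record = {
    "transcript_id": "b7d4e9f2-1a3c-4d5e-8f90-aabbccddeeff",
    "model": "anthropic/claude-3-5-sonnet",
    # Custom columns (made-up names) -- kept small since they are
    # forwarded into the scan results database as transcript_metadata.
    "experiment": "baseline-v2",
    "git_commit": "8c1f2ab",
}

# After import, the custom columns surface on each transcript as:
#   transcript.metadata["experiment"]   # "baseline-v2"
#   transcript.metadata["git_commit"]   # "8c1f2ab"
# and can be referenced in Transcripts.where() filters (see the
# Transcripts API for the exact filter syntax).
```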
## Messages
The `messages` field is a JSON-encoded string of `list[ChatMessage]`. Several helper functions are available in the `inspect_ai` package to convert the raw message formats of various providers to the Inspect `ChatMessage` format:
| Provider API | Functions |
|---|---|
| OpenAI Chat Completions | `messages_from_openai()`, `model_output_from_openai()` |
| OpenAI Responses | `messages_from_openai_responses()`, `model_output_from_openai_responses()` |
| Anthropic Messages | `messages_from_anthropic()`, `model_output_from_anthropic()` |
| Google Generate Content | `messages_from_google()`, `model_output_from_google()` |
For many straightforward transcripts the list of messages will be all that is needed for analysis.
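For instance, to populate `messages` from an OpenAI Chat Completions exchange (a minimal sketch: the import path and exact signatures of these helpers are assumptions, so check the `inspect_ai` reference for details):

```python
import json

# Assumed import location for the helpers named above -- confirm the
# exact module path in the inspect_ai reference documentation.
from inspect_ai.model import messages_from_openai

# Raw OpenAI Chat Completions message history (request messages plus
# the assistant completion).
openai_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the attached report."},
    {"role": "assistant", "content": "The report covers Q3 revenue growth."},
]

# Convert to Inspect ChatMessage objects (sketch: assumes the helper
# accepts a list of OpenAI-format message dicts). The corresponding
# model_output_from_openai() helper converts the raw completion
# response when you also need a ModelOutput.
chat_messages = messages_from_openai(openai_messages)

# Serialize for the messages column (JSON-encoded list[ChatMessage]).
messages_json = json.dumps([m.model_dump() for m in chat_messages])
```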
## Events
The `events` field is a JSON-encoded string of `list[Event]`. Note that if your scanners deal entirely in messages rather than events (as a great many do), then it is not necessary to provide events.

Events are typically important when you are either analyzing complex multi-agent transcripts or doing very granular scanning for specific phenomena (e.g. tool calling errors).

While you can include any of the event types defined in `inspect_ai.event`, there is a subset that is both likely to be of interest and that maps onto data provided by observability platforms and/or OTEL traces. These include:
| Event | Description |
|---|---|
| ModelEvent | Generation call to a model. |
| ToolEvent | Tool call made by a model. |
| ErrorEvent | Runtime error aborting transcript. |
| SpanBeginEvent | Mark the beginning of a transcript span (e.g. agent execution, tool call, custom block, etc.). |
| SpanEndEvent | Mark the end of a transcript span. |
Most observability systems will have some equivalent of the above in their traces. When reconstructing model events you will also likely want to use the helper functions mentioned above in Messages for converting raw model API payloads to ChatMessage.
The `events` field is only important if you have scanners that do event analysis. Note that the default `llm_scanner()` provided within Scout looks only at messages, not events.
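If you do include events, event-level scanning might look something like the sketch below (assumptions: the event classes are importable from `inspect_ai.event` as noted above, and `ToolEvent` exposes an `error` attribute when a tool call fails; verify both against the package reference):

```python
from inspect_ai.event import ToolEvent  # assumed module path (see above)


def tool_error_rate(events: list) -> float:
    """Fraction of tool calls in an event history that reported an error.

    Sketch only: assumes ToolEvent carries an `error` attribute that is
    set when the tool call failed.
    """
    tool_events = [e for e in events if isinstance(e, ToolEvent)]
    if not tool_events:
        return 0.0
    failed = [e for e in tool_events if getattr(e, "error", None) is not None]
    return len(failed) / len(tool_events)
```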
## Importing Data
Now that you understand the schema and have an idea of how you want to map your data into it, use one of the following methods to create the database:

- **Transcript API**: Read and parse transcripts into `Transcript` objects and use the `TranscriptsDB.insert()` function to add them to the database (see the sketch below).

- **Arrow Import**: Read an existing set of transcripts stored in Arrow/Parquet and pass them to `TranscriptsDB.insert()` as a PyArrow `RecordBatchReader`.

- **Parquet Data Lake**: Point the `TranscriptDB` at an existing data lake (ensuring that the records adhere to the transcript database schema).

- **Inspect Logs**: Import Inspect AI eval logs from a log directory.
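For example, a minimal import using the Transcript API might look like this (a sketch: the `inspect_scout` import path, the `TranscriptsDB` setup, and the raw file layout are all assumptions; see the Transcripts API reference for the exact signatures):

```python
import json
from pathlib import Path

# Assumed import paths -- confirm the actual module names in the Scout
# and inspect_ai reference documentation.
from inspect_scout import Transcript, TranscriptsDB   # hypothetical path
from inspect_ai.model import messages_from_openai     # see Messages above


def import_transcripts(db: TranscriptsDB, raw_dir: Path) -> None:
    """Parse raw JSON transcript files and insert them into the database.

    Sketch only: assumes each file contains {"id": ..., "messages": [...]}
    with messages in OpenAI Chat Completions format.
    """
    transcripts = []
    for path in sorted(raw_dir.glob("*.json")):
        raw = json.loads(path.read_text())
        transcripts.append(
            Transcript(
                transcript_id=raw["id"],
                source_type="custom_json",
                source_uri=path.as_uri(),
                messages=messages_from_openai(raw["messages"]),
            )
        )
    db.insert(transcripts)
```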