from inspect_viz import Data
from inspect_viz.view.beta import tool_calls

tools = Data.from_file("cybench_tools.parquet")
tool_calls(tools)

Tool Calls
Overview
The tool_calls() function creates a heat map visualising tool calls over evaluation turns.
Data Preparation
To create the plot we read a raw messages data frame from an eval log using the messages_df()
function, then filter down to just the fields we require for visualization:

from inspect_ai.analysis import messages_df, log_viewer, model_info, prepare, EvalModel, MessageColumns, SampleSummary

# read messages from log
log = "<path-to-log>.eval"

# Be sure to add EvalModel column so links can be prepared
df = messages_df(log, columns=EvalModel + SampleSummary + MessageColumns)

# trim columns
df = df[[
    "eval_id",
    "sample_id",
    "message_id",
    "model",
    "id",
    "order",
    "tool_call_function",
    "limit",
    "log"
]]

# prepare the data frame with model info and log links
df = prepare(df, [
    model_info(),
    log_viewer("message", url_mappings={
        "logs": "https://samples.meridianlabs.ai/"
    })
])

# write to parquet
df.to_parquet("cybench_tools.parquet")

Note that the trimming of columns is particularly important because Inspect Viz embeds datasets directly in the web pages that host them (so we want to minimize their size for page load performance and bandwidth usage).
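
To make the size argument concrete, here is a minimal sketch that compares the serialized size of a data frame before and after trimming. It uses a small synthetic pandas data frame rather than a real eval log: the values, the large "content" column, and the serialized_size() helper are all assumptions made for illustration, and CSV length is only a rough proxy for the size of an embedded dataset.

```python
import pandas as pd

# Synthetic stand-in for a messages data frame (hypothetical values).
df = pd.DataFrame({
    "id": ["sample-1", "sample-1", "sample-2"],
    "order": [1, 2, 1],
    "tool_call_function": ["bash", "python", "bash"],
    "limit": [None, None, "message"],
    # A large text column that the plot never reads (assumed for illustration).
    "content": ["x" * 5000, "y" * 5000, "z" * 5000],
})


def serialized_size(frame: pd.DataFrame) -> int:
    """Return the size in bytes of the frame written as CSV (a rough size proxy)."""
    return len(frame.to_csv(index=False).encode("utf-8"))


# Keep only the fields the plot needs.
trimmed = df[["id", "order", "tool_call_function", "limit"]]

full_bytes = serialized_size(df)
trimmed_bytes = serialized_size(trimmed)
print(f"full: {full_bytes} bytes, trimmed: {trimmed_bytes} bytes")
```

The same comparison could be run on a real messages data frame to see how much the trimming step saves before writing the parquet file.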
Function Reference
Heat map visualising tool calls over evaluation turns.
def tool_calls(
    data: Data,
    x: str = "order",
    y: str = "id",
    tool: str = "tool_call_function",
    limit: str = "limit",
    tools: list[str] | None = None,
    x_label: str | None = "Message",
    y_label: str | None = "Sample",
    title: str | Title | None = None,
    marks: Marks | None = None,
    width: float | None = None,
    height: float | None = None,
    **attributes: Unpack[PlotAttributes],
) -> Component

data
Data
Messages data table. This is typically created using a data frame read with the inspect messages_df() function.

x
str
Name of field for x axis (defaults to “order”).

y
str
Name of field for y axis (defaults to “id”).

tool
str
Name of field with tool name (defaults to “tool_call_function”).

limit
str
Name of field with sample limit (defaults to “limit”).

tools
list[str] | None
Tools to include in plot (and order to include them). Defaults to all tools found in data.

x_label
str | None
x-axis label (defaults to “Message”).

y_label
str | None
y-axis label (defaults to “Sample”).

title
str | Title | None
Title for plot (str or mark created with the title() function).

marks
Marks | None
Additional marks to include in the plot.

width
float | None
The outer width of the plot in pixels, including margins. Defaults to 700.

height
float | None
The outer height of the plot in pixels, including margins. The default is width / 1.618 (the golden ratio).

**attributes
Unpack[PlotAttributes]
Additional PlotAttributes. By default, the margin_top is set to 0, margin_left to 20, margin_right to 100, color_label is “Tool”, y_ticks is empty, and x_ticks and color_domain are calculated from data.

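
Because the tools parameter controls both which tools appear and the order in which they are listed, one plausible way to build that list is to rank tools by how often they are called. The sketch below does this with pandas on a synthetic messages table. The tool names, sample ids, and the choice to order by frequency are assumptions for illustration, not something the function requires.

```python
import pandas as pd

# Synthetic messages table (hypothetical tool names and sample ids).
messages = pd.DataFrame({
    "id": ["s1", "s1", "s1", "s2", "s2", "s2", "s2"],
    "order": [1, 2, 3, 1, 2, 3, 4],
    "tool_call_function": ["bash", "python", "bash", "bash", None, "python", "submit"],
})

# Count calls per tool, ignoring messages that made no tool call.
# value_counts() sorts from most to least frequent.
tool_order = messages["tool_call_function"].dropna().value_counts().index.tolist()
print(tool_order)  # ['bash', 'python', 'submit']
```

A list like this could then be passed as tool_calls(tools, tools=tool_order), where the first tools is the Data object loaded with Data.from_file(). Leaving a name out of the list should exclude that tool from the plot, going by the parameter description above.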
Implementation
The Tool Calls example demonstrates how this view was implemented using lower level plotting components.