Scores Heatmap
Overview
The scores_heatmap() function renders a heatmap for comparing eval scores.

from inspect_viz import Data
from inspect_viz.view.beta import scores_heatmap

evals = Data.from_file("evals.parquet")
scores_heatmap(evals, height=200, legend=True)
Data Preparation
Above we read the data for the plot from a parquet file. This file was in turn created by:
1. Reading logs into a data frame with evals_df().
2. Using the prepare() function to add model_info() and log_viewer() columns to the data frame.
from inspect_ai.analysis import evals_df, log_viewer, model_info, prepare

df = evals_df("logs")
df = prepare(df,
    model_info(),
    log_viewer("eval", {"logs": "https://samples.meridianlabs.ai/"}),
)
df.to_parquet("evals.parquet")
You can additionally use the task_info() operation to map lower-level task names to task display names (e.g. "gpqa_diamond" -> "GPQA Diamond").
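For example, the preparation step above could also include task_info(). This is a sketch only: it assumes task_info() is importable from inspect_ai.analysis alongside the other operations and can be called with its default arguments.

from inspect_ai.analysis import evals_df, log_viewer, model_info, prepare, task_info

df = evals_df("logs")
df = prepare(df,
    task_info(),    # add task display names (e.g. "gpqa_diamond" -> "GPQA Diamond")
    model_info(),   # add model display names
    log_viewer("eval", {"logs": "https://samples.meridianlabs.ai/"}),
)
df.to_parquet("evals.parquet")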
Note that both the log viewer links and model names are optional (the plot will render without links and use raw model strings if the data isn't prepared with log_viewer() and model_info()).
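If you skip preparation, you can also point the plot at raw columns explicitly via the task_name and model_name parameters described below. The following is a sketch, assuming the unprepared data frame includes "task_name" and "model" columns (actual column names may vary with your logs):

evals = Data.from_file("evals.parquet")

# fall back to raw task and model columns rather than prepared display names
scores_heatmap(evals, task_name="task_name", model_name="model")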
Function Reference
Creates a heatmap plot of the success rate of eval data.
def scores_heatmap(
    data: Data,
    task_name: str = "task_display_name",
    task_label: str | None | NotGiven = None,
    model_name: str = "model_display_name",
    model_label: str | None | NotGiven = None,
    score_value: str = "score_headline_value",
    cell: CellOptions | None = None,
    tip: bool = True,
    title: str | Title | None = None,
    marks: Marks | None = None,
    height: float | None = None,
    width: float | None = None,
    legend: Legend | bool | None = None,
    sort: Literal["ascending", "descending"] | SortOrder | None = "ascending",
    orientation: Literal["horizontal", "vertical"] = "horizontal",
    **attributes: Unpack[PlotAttributes],
) -> Component
data (Data)
Evals data table.

task_name (str)
Name of column to use for columns.

task_label (str | None | NotGiven)
x-axis label (defaults to None).

model_name (str)
Name of column to use for rows.

model_label (str | None | NotGiven)
y-axis label (defaults to None).

score_value (str)
Name of the column to use as values to determine cell color.

cell (CellOptions | None)
Options for the cell marks.

tip (bool)
Whether to show a tooltip with the value when hovering over a cell (defaults to True).

title (str | Title | None)
Title for the plot (str or mark created with the title() function).

marks (Marks | None)
Additional marks to include in the plot.

height (float | None)
The outer height of the plot in pixels, including margins. The default is width / 1.618 (the golden ratio).

width (float | None)
The outer width of the plot in pixels, including margins. Defaults to 700.

legend (Legend | bool | None)
Options for the legend. Pass None to disable the legend.

sort (Literal["ascending", "descending"] | SortOrder | None)
Sort order for the x and y axes. If ascending, the highest values will be sorted to the top right. If descending, the highest values will appear in the bottom left. If None, no sorting is applied. If a SortOrder is provided, it will be used to sort the x and y axes.

orientation (Literal["horizontal", "vertical"])
The orientation of the heatmap. If "horizontal", the tasks will be on the x-axis and models on the y-axis. If "vertical", the tasks will be on the y-axis and models on the x-axis.

**attributes (Unpack[PlotAttributes])
Additional PlotAttributes.
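Putting the reference together, a call that customizes the title, sort order, orientation, and legend might look like this (a sketch with illustrative values only; the data is assumed to be prepared as shown earlier):

scores_heatmap(
    evals,
    title="Eval Scores by Model and Task",
    sort="descending",           # highest values appear toward the bottom left
    orientation="vertical",      # tasks on the y-axis, models on the x-axis
    width=500,
    legend=True,
)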
Implementation
The Scores Heatmap example demonstrates how this view was implemented using lower-level plotting components.