inspect_viz.sandbox
Evals
evals_summary_plot
Bar plot for comparing evals.
def evals_summary_plot(
    evals: Data,
    x: str = "model",
    fx: str = "task_name",
    y: AxisValue | None = None,
    x_filter: bool | AxisFilter = False,
    fx_filter: bool | AxisFilter = False,
) -> Component
evals
Data-
Evals data table (typically read using evals_df()).
x
str-
Name of field for x axis (defaults to “model”)
fx
str-
Name of field for x facet (defaults to “task_name”)
y
AxisValue | None-
Definition for y axis (defaults to axis_score())
x_filter
bool | AxisFilter-
Optional filtering control for x axis.
fx_filter
bool | AxisFilter-
Optional filtering control for fx axis.
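The relationship between x and fx can be easier to see in plain Python. The sketch below is not the library's implementation; it only illustrates, under stated assumptions, how rows might be arranged into one facet per fx value with one bar per x value inside it. The helper group_bars and the "score" field name are hypothetical; "model" and "task_name" are the documented defaults.

```python
from collections import defaultdict


def group_bars(rows, x="model", fx="task_name", value_field="score"):
    """Arrange rows into {facet value: {x value: bar height}}.

    Hypothetical helper: `value_field="score"` is an assumed column name,
    not one taken from the library.
    """
    facets = defaultdict(dict)
    for row in rows:
        # Each distinct fx value becomes a facet; each x value a bar in it.
        facets[row[fx]][row[x]] = row[value_field]
    return dict(facets)


rows = [
    {"model": "model-a", "task_name": "task-1", "score": 0.7},
    {"model": "model-b", "task_name": "task-1", "score": 0.6},
    {"model": "model-a", "task_name": "task-2", "score": 0.5},
]

layout = group_bars(rows)
```

With the defaults, each task gets its own facet and the models are compared side by side within it; swapping x and fx would instead give one facet per model.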
evals_summary_table
Table that summarizes eval scores by model and task.
def evals_summary_table(
    evals: Data, columns: Sequence[str | Column] | None = None
) -> Component
evals
Data-
Evals data table.
columns
Sequence[str | Column] | None-
Column definitions (defaults to model, task_name, and headline metric).
Axis
AxisFilter
Filter definition for plot axis.
class AxisFilter(BaseModel)
Attributes
label
str | None-
Filter label (defaults to column name).
value
Literal['all'] | str | list[str]-
Initial value (defaults to “all”, which applies no filter).
multiple
bool-
Enable filtering on multiple values.
width
int | None-
Width of filter input in pixels.
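The three shapes accepted by value ('all', a single string, or a list of strings) can be read as three selection modes. The following is a minimal sketch of that reading in plain Python; apply_filter is a hypothetical helper written for illustration and is not part of the library, which renders an interactive control rather than filtering a list.

```python
from typing import Literal, Union

# Mirrors the documented type of AxisFilter.value.
FilterValue = Union[Literal["all"], str, list[str]]


def apply_filter(rows: list[dict], field: str, value: FilterValue) -> list[dict]:
    """Return the rows whose `field` passes the filter (hypothetical helper)."""
    if value == "all":
        # "all" means no filtering: every row is kept.
        return list(rows)
    # A single string selects one value; a list selects several.
    selected = {value} if isinstance(value, str) else set(value)
    return [row for row in rows if row[field] in selected]


rows = [
    {"model": "model-a", "task_name": "task-1"},
    {"model": "model-b", "task_name": "task-1"},
    {"model": "model-c", "task_name": "task-2"},
]

all_rows = apply_filter(rows, "model", "all")
one_model = apply_filter(rows, "model", "model-a")
two_models = apply_filter(rows, "model", ["model-a", "model-c"])
```

A list-valued initial selection presumably only makes sense together with multiple enabled, although the source does not state this explicitly.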
AxisValue
Axis value options.
class AxisValue(BaseModel)
Attributes
label
str-
Axis label.
value_field
str-
Field to read value from.
stderr_field
str | None-
Field to read stderr from (optional, but required to plot confidence intervals).
ci
float | None-
Confidence interval (e.g. 0.80, 0.90, 0.95, etc.).
domain
list[float] | None-
Domain of axis (range of values to display).
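To make the link between stderr_field and ci concrete, here is a minimal sketch of how a confidence interval could be derived from a value and its standard error. It assumes a two-sided normal approximation; this document does not say how the library computes its intervals, so treat the formula and the confidence_bounds helper as illustrative assumptions.

```python
from statistics import NormalDist


def confidence_bounds(value: float, stderr: float, ci: float = 0.95) -> tuple[float, float]:
    """Two-sided interval under a normal approximation (assumed, not confirmed)."""
    # For ci = 0.95 this is the 97.5th percentile of the standard normal (about 1.96).
    z = NormalDist().inv_cdf(0.5 + ci / 2)
    return value - z * stderr, value + z * stderr


# Example: a score of 0.72 with a standard error of 0.03.
lower, upper = confidence_bounds(0.72, 0.03, ci=0.95)
narrow_lower, narrow_upper = confidence_bounds(0.72, 0.03, ci=0.80)
```

Under this assumption, a lower ci such as 0.80 gives a narrower interval than 0.95 for the same standard error, which is why the error bars shrink as ci decreases.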
axis_score
Axis definition for scores from evals_df() data frames.
def axis_score(ci: float = 0.95) -> AxisValue
ci
float-
Confidence interval (e.g. 0.80, 0.90, 0.95, etc.).