Evals Summary

from inspect_ai.analysis.beta import evals_df, samples_df
from inspect_viz import Data
from inspect_viz.table import table

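# Read the eval logs found under "logs" into a data frame (one row per eval)
df = evals_df("logs")
# Wrap the data frame so it can be passed to inspect_viz components
evals = Data.from_dataframe(df)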

Use the evals_summary_plot() function to get a quick summary of the scores by task and model in a set of log files:

from inspect_viz.sandbox import evals_summary_plot

evals_summary_plot(evals)

Optionally, add filtering by model (x-axis) and/or task name (fx-axis):

evals_summary_plot(
    evals,
    x_filter=True,
    fx_filter=True
)

Use the evals_summary_table() function to create a table summarizing results:

from inspect_viz.sandbox import evals_summary_table

evals_summary_table(evals)
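
To make concrete what a summary "by task and model" involves, here is a minimal sketch in plain Python that groups eval rows by task and model and averages a score for each pair. It does not show how evals_summary_plot() or evals_summary_table() are implemented. The sample rows, the summarize() helper, and the column names task_name, model, and score_headline_value are assumptions made for illustration, as is the choice to average repeated runs.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows standing in for the evals data frame.
# Column names and values are assumptions for illustration only.
records = [
    {"task_name": "task-1", "model": "model-a", "score_headline_value": 0.50},
    {"task_name": "task-1", "model": "model-b", "score_headline_value": 0.75},
    {"task_name": "task-2", "model": "model-a", "score_headline_value": 0.25},
    {"task_name": "task-2", "model": "model-a", "score_headline_value": 0.75},
]


def summarize(rows):
    """Average the score for each (task, model) pair (hypothetical helper)."""
    grouped = defaultdict(list)
    for row in rows:
        key = (row["task_name"], row["model"])
        grouped[key].append(row["score_headline_value"])
    # Averaging repeated runs is one plausible aggregation, not the library's documented one
    return {key: mean(values) for key, values in grouped.items()}


summary = summarize(records)

# Print one line per (task, model) pair, sorted for stable output
for (task, model), score in sorted(summary.items()):
    print(f"{task:<8} {model:<8} {score:.2f}")
```

Each key in summary corresponds to one cell of a task-by-model grid, which is roughly the shape of information a summary plot or table would present.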