from inspect_ai.analysis.beta import evals_df, samples_df
from inspect_viz import Data
from inspect_viz.table import table
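# Read the eval logs in the "logs" directory into a data frame
df = evals_df("logs")
# Wrap the data frame for use with inspect_viz
evals = Data.from_dataframe(df)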
Evals Summary
Use the evals_summary_plot() function to get a quick summary of the scores by task and model in a set of log files:
from inspect_viz.sandbox import evals_summary_plot
evals_summary_plot(evals)
Optionally, add filtering by model (x-axis) and/or task name (fx-axis):
evals_summary_plot(
    evals,
    x_filter=True,
    fx_filter=True
)
Use the evals_summary_table() function to create a table summarizing results:
from inspect_viz.sandbox import evals_summary_table
evals_summary_table(evals)
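To make "scores by task and model" concrete, the following is a minimal standard-library sketch of that kind of grouping. It does not reproduce how evals_summary_plot() or evals_summary_table() work internally. The rows, the field names (task_name, model, score), and the summarize() helper are all hypothetical and used only for illustration.

```python
from collections import defaultdict

# Hypothetical rows standing in for an evals data frame; the field names
# (task_name, model, score) and the values are assumptions for this sketch.
rows = [
    {"task_name": "task_a", "model": "model_x", "score": 0.50},
    {"task_name": "task_a", "model": "model_y", "score": 0.75},
    {"task_name": "task_b", "model": "model_x", "score": 0.25},
]


def summarize(rows):
    """Group scores into a {task: {model: score}} mapping."""
    summary = defaultdict(dict)
    for row in rows:
        summary[row["task_name"]][row["model"]] = row["score"]
    # Convert to plain dicts so the result compares and prints cleanly
    return {task: dict(models) for task, models in summary.items()}


summary = summarize(rows)

# Print one line per (task, model) pair
for task, models in summary.items():
    for model, score in models.items():
        print(f"{task}\t{model}\t{score:.2f}")
```

Each outer key corresponds to one task and each inner key to one model, which is the same two-way breakdown the summary plot and table present.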