v3.0.11

  • Update types for Inspect viewer configuration (inspect_ai >= 0.3.236 now required).

v3.0.10

  • Add target_display_name: str | None parameter to audit() / auditor_agent() for overriding the {target_model} template variable in the auditor’s prompts. Use this when the target’s model id is an internal codename or pre-release identifier that would break scenario realism if the auditor mentioned it to the target. Defaults to the target model’s id (existing behavior).
  • Add generate: AuditorGenerate | None parameter to auditor_agent() for overriding the per-turn auditor model call. The hook receives (model, messages, tools, cache) and returns the ModelOutput to use for that turn — use it for best-of-N sampling or other whole-turn selection that an Approver (which fires per tool call, after the message is appended) cannot express.
  • Add inspect_petri.util namespace with MarkdownDoc, parse_markdown_file(), read_markdown_directory(), and write_markdown_directory() — the generic frontmatter-.md collection loader that seeds_dataset and judge_dimensions use internally. Lets downstream packages define their own frontmatter-md collections (e.g. probe questions) without reimplementing the parser or routing through Sample. MarkdownDoc.to_markdown() writes the BIG-Bench benchmark-data canary string (BENCHMARK_CANARY) into the YAML frontmatter by default so generated files are filtered from training corpora; pass canary=None to suppress.
  • Add exclude: Collection[str] to auditor_tools() for dropping tools by name after the structured options (rollback, target_tools) are applied. Covers cases those flags don’t — e.g. excluding set_system_message when a custom target supplies its own system prompt, or excluding restart_conversation while keeping rollback_conversation.
  • Add model override to audit_judge() for rescoring existing logs with a different judge model.
  • Expose sample metadata as template variables in the auditor system_message / user_message (and via controller().state.metadata for custom tools). Built-in variables take precedence; populate metadata via seed front matter or a setup solver chained before audit_solver.
  • Add TargetContext.metadata and deprecate the third metadata positional arg to custom target agents — use execute(state, context) and read context.metadata. The 3-arg form still works with a DeprecationWarning.

v3.0.9

  • Improve branching behavior for both judge and viewer.
  • Viewer: Customize inspect view for Petri transcripts.

v3.0.8

  • Pass sample metadata to target agent.
  • Add seed instructions to transcript for both auditor and target timelines.
  • Add id:<id>[,<id>...] selector to seed_instructions for selecting specific built-in seeds by id.
  • Update compaction to use latest Inspect compaction protocol.

v3.0.6

  • Add cache: bool | CachePolicy parameter to audit() covering all model roles. Target generations are scoped by trajectory branch path so sibling rollback branches with identical input get distinct cache keys; new TargetContext.scoped_cache() / TargetContext.trajectory_path expose this for custom target agents.
  • realism_approver now catches RefusalError from the realism model and fails open (approves), consistent with its existing fail-open behaviour for invalid structured output, so a content-filter refusal mid-audit no longer crashes the sample.
  • realism_approver is now classified as a utility agent so not displayed in the main auditor timeline by default.
  • audit_judge now catches RefusalError from the judge model and returns Result(value=None, explanation=..., metadata={"refusal": True}) instead of letting the exception crash the sample, so the audit transcript is preserved even when the judge refuses to score it.
  • Surface target generate errors to the auditor.
  • Validate model provided tool name in create_tool() function.
  • Add keep_tools parameter to restart_conversation() (defaulting to True) to preserve previously defined tools.
  • Bugfix: Prevent deadlock when end_conversation() co-occurs with other tools.
  • Bugfix: Strip trailing whitespace from prefill in resume tool.

v3.0.5

  • Ensure that ChatMessageTool is staged with the tool name (required by Gemini).
  • Tool views: Use snake case for consistent presentation in Inspect View.
  • create_tool now accepts parameters as a plain dict and normalizes/validates it internally, instead of typing it as ToolParams. The previous typing produced a deeply-nested wire schema that some providers reject outright (Gemini 2.5 Pro: “too many states for serving”; xAI Grok: unsupported boolean additionalProperties). Invalid schemas now surface as a recoverable ToolError to the auditor.

v3.0.4

  • Print warning if the auditor loop exits without any successful target turns.
  • Refactor target protocol: new auditor-side Controller and target-side TargetContext porcelain over a stripped-down Channel transport. Custom targets now take execute(state, context: TargetContext).
  • Staging is now driven by typed Slot[T] constants (SYSTEM, USER, TOOL_RESULT, PREFILL) with a single Stage command, replacing StageMessage/StageToolResult.
  • Add TargetContext.expect() to declare per-slot allowed-key policies; out-of-policy Stage commands now raise TargetError immediately.
  • Add ReplayDesyncError, raised when trajectory replay diverges from the recorded step sequence.
  • set_system_message now rejects mid-conversation calls with a ToolError.
  • Removed: auditor()/target() accessors, Auditor/Target, StageMessage, StageToolResult, SendData, ResponseData, ResponseStaged.
  • Require inspect-ai>=0.3.211.
  • Add turn_counter option to auditor to print a reminder of the current turn and remaining turns (defaults to False).
  • Handle malformed JSONSchema from model when creating tool views (normalize uppercase keys to lowercase).

v3.0.3

  • Add instructions parameter to audit_judge() for injecting additional content into the judge system prompt.

v3.0.2

  • Migrate from Petri v2 repo (see petri-v2 branch for v2 code).

v3.0.1

  • Integrated support for target/auditor/judge in Inspect View.

v3.0.0

  • Initial development release for Petri 3.0.