Inspect Harbor
Inspect Harbor provides an interface to run Harbor tasks using Inspect AI.
Installation
Install from PyPI:
pip install inspect-harbor
Or with uv:
uv add inspect-harbor
Prerequisites
Before running Harbor tasks, ensure you have:
- Python 3.12 or higher – required by inspect_harbor.
- Docker installed and running – required when tasks execute in the Docker sandbox (the default).
- Model API keys – set the appropriate environment variables (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY).
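The checklist above can be verified with a short script before the first run. This is a minimal sketch using only the Python standard library. The helper name check_prerequisites is hypothetical and not part of inspect_harbor, and it only confirms that a docker executable is on PATH, not that the Docker daemon is actually running.

```python
import os
import shutil
import sys


def check_prerequisites(env_vars=("OPENAI_API_KEY", "ANTHROPIC_API_KEY")):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper for illustration, not part of inspect_harbor.
    """
    problems = []
    # inspect_harbor requires Python 3.12 or higher.
    if sys.version_info < (3, 12):
        problems.append("Python 3.12 or higher is required")
    # Only checks that the executable exists, not that the daemon is up.
    if shutil.which("docker") is None:
        problems.append("docker executable not found on PATH")
    # One key is enough; which one is needed depends on the model you pass.
    if not any(os.environ.get(name) for name in env_vars):
        problems.append("no model API key set (checked: " + ", ".join(env_vars) + ")")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("MISSING:", problem)
```

Pass a different tuple of variable names if you use a provider other than the two shown here.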
Quick Start
The fastest way to get started is to run a dataset from the Harbor registry.
CLI:
# Run hello-world dataset
inspect eval inspect_harbor/hello_world --model openai/gpt-5-mini
# Run terminal-bench-sample dataset
inspect eval inspect_harbor/terminal_bench_sample --model openai/gpt-5
Python API:
from inspect_ai import eval
from inspect_harbor import hello_world, terminal_bench_sample
# Run hello-world
eval(hello_world(), model="openai/gpt-5-mini")
# Run terminal-bench-sample
eval(terminal_bench_sample(), model="openai/gpt-5")
What this does
- Loads the dataset from the Harbor registry.
- Downloads and caches all tasks in the dataset.
- Solves the tasks with the default ReAct agent scaffold.
- Executes in a Docker sandbox environment.
- Stores results in ./logs.
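To confirm that a run wrote its results, you can look for the newest file in the log directory. The sketch below uses only the standard library and makes no assumption about the log file format. The helper name latest_log is hypothetical, and the ./logs default simply mirrors the location named above.

```python
from pathlib import Path


def latest_log(log_dir="./logs"):
    """Return the most recently modified file in log_dir, or None if there is none.

    Hypothetical helper for illustration, not part of inspect_harbor.
    """
    directory = Path(log_dir)
    if not directory.is_dir():
        return None
    files = [p for p in directory.iterdir() if p.is_file()]
    # max() with default=None handles an empty directory without raising.
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    print(latest_log())
```

Inspect AI also ships a log viewer, started with the inspect view command, for browsing results interactively.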
See the Registry for the full list of available datasets, and the Using Harbor guides for more detail on datasets, task parameters, agents, and advanced features.