cais/swebenchpro

Coding

SWE-bench Pro with anti-exploitation measures (git history isolation and GitHub network blocking). 731 tasks across Python, JavaScript, TypeScript, and Go.

Run this task

CLI:

inspect eval inspect_harbor/cais_swebenchpro --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import cais_swebenchpro

eval(cais_swebenchpro(), model="openai/gpt-5")
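
With 731 samples, a full run can be expensive, so a short trial run first may be worthwhile. The following is a minimal sketch, assuming the standard `limit` argument of Inspect's `eval()`; the helper name `run_smoke_test` and its default of 5 samples are illustrative choices, not part of Harbor.

```python
def run_smoke_test(model: str = "openai/gpt-5", limit: int = 5):
    """Evaluate only the first `limit` samples of the task (hypothetical helper)."""
    # Imports are kept inside the function so that defining the helper
    # does not require inspect_ai or inspect_harbor to be installed.
    from inspect_ai import eval
    from inspect_harbor import cais_swebenchpro

    # `limit` caps the number of samples evaluated; the task itself is unchanged.
    return eval(cais_swebenchpro(), model=model, limit=limit)
```

Calling `run_smoke_test()` would evaluate five samples with the default model. The same cap should be available on the command line by adding `--limit 5` to the `inspect eval` command above.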

Dataset information

Harbor registry: cais/swebenchpro
Inspect task: cais_swebenchpro
Latest digest: sha256:0684038ce8eae92d435a27307d1c5843e291152898f429af130062e8df110768
Samples: 731
Paper: arxiv
Source: https://github.com/scaleapi/SWE-bench_Pro-os

See Task Parameters for the parameter set shared across all Harbor tasks.