openai/swe-lancer-diamond-ic

Coding

SWE-Lancer Diamond (IC) is the individual-contributor split of OpenAI’s SWE-Lancer benchmark: real freelance software-engineering issues from Upwork, fixed in-repo and graded by end-to-end tests.


Run this task

CLI:

inspect eval inspect_harbor/openai_swe_lancer_diamond_ic --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import openai_swe_lancer_diamond_ic

eval(openai_swe_lancer_diamond_ic(), model="openai/gpt-5")
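
For a quick smoke test before a full run, it can help to script the CLI call. The sketch below only assembles the command line shown above and does not invoke Inspect itself. The helper name `build_eval_command` is hypothetical, and the `--limit` flag (used here to cap the number of samples) is assumed to behave as in current Inspect releases; confirm with `inspect eval --help`.

```python
import shlex

# Task path copied from the CLI example above.
TASK = "inspect_harbor/openai_swe_lancer_diamond_ic"


def build_eval_command(model, limit=None):
    """Return the argv list for one `inspect eval` run (hypothetical helper)."""
    argv = ["inspect", "eval", TASK, "--model", model]
    if limit is not None:
        # Assumption: --limit caps how many samples are evaluated.
        argv += ["--limit", str(limit)]
    return argv


if __name__ == "__main__":
    # Print the command rather than run it, so it can be reviewed first.
    print(shlex.join(build_eval_command("openai/gpt-5", limit=5)))
```

To execute the command, the returned list can be passed to `subprocess.run(argv, check=True)`.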

Dataset information

Harbor registry	openai/swe-lancer-diamond-ic
Inspect task	`openai_swe_lancer_diamond_ic`
Latest digest	sha256:d0645e1152d417dd3ec8b36c324c03a8729b3fa48c8840f8935f93582c4dce28
Samples	198
Paper	arxiv
Source	https://github.com/openai/SWELancer-Benchmark

See Task Parameters for the parameter set shared across all Harbor tasks.