quesma/compilebench

Coding

CompileBench: real-world build tasks for open-source projects (curl, GNU coreutils, jq, and others), ranging from straightforward builds to reviving 2003-era code and cross-compiling.

Run this task

CLI:

inspect eval inspect_harbor/quesma_compilebench --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import quesma_compilebench

eval(quesma_compilebench(), model="openai/gpt-5")
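
For a quick smoke run before committing to all 15 samples, it can help to cap the number of samples. The sketch below only assembles the command line for such a run. `build_eval_command` is a hypothetical helper and not part of `inspect_harbor`. It assumes the Inspect CLI's `--limit` option for capping samples, and the task path and model name are copied from the CLI example above.

```python
import shlex

# Task path taken from the CLI example above.
TASK = "inspect_harbor/quesma_compilebench"


def build_eval_command(model, limit=None):
    """Return the argv list for an `inspect eval` run of CompileBench.

    Hypothetical helper for illustration; it builds the command but
    does not execute it.
    """
    argv = ["inspect", "eval", TASK, "--model", model]
    if limit is not None:
        if limit < 1:
            raise ValueError("limit must be a positive integer")
        # Assumption: --limit caps how many samples Inspect evaluates.
        argv += ["--limit", str(limit)]
    return argv


if __name__ == "__main__":
    # Print a copy-pasteable shell command for a 3-sample smoke run.
    print(shlex.join(build_eval_command("openai/gpt-5", limit=3)))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues if the model name ever contains unusual characters.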

| Field | Value |
| --- | --- |
| Harbor registry | quesma/compilebench |
| Inspect task | `quesma_compilebench` |
| Latest digest | sha256:8b7ea3e0618b0f3fb2db1b5695628cfc2b2d5f405c5624b3b44d1602beca338a |
| Samples | 15 |
| Paper | arxiv |
| Source | https://github.com/QuesmaOrg/CompileBench |

See Task Parameters for the parameter set shared across all Harbor tasks.