quesma/compilebench
Coding
CompileBench: real-world build/compile tasks (curl, GNU coreutils, jq, etc.) ranging from easy builds to reviving 2003-era code and cross-compiling.
Run this task
CLI:
inspect eval inspect_harbor/quesma_compilebench --model openai/gpt-5
Python:
from inspect_ai import eval
from inspect_harbor import quesma_compilebench
eval(quesma_compilebench(), model="openai/gpt-5")
Dataset information
| Harbor registry | quesma/compilebench |
| Inspect task | quesma_compilebench |
| Latest digest | sha256:8b7ea3e0618b0f3fb2db1b5695628cfc2b2d5f405c5624b3b44d1602beca338a |
| Samples | 15 |
| Paper | arxiv |
| Source | https://github.com/QuesmaOrg/CompileBench |
See Task Parameters for the parameter set shared across all Harbor tasks.
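When scripting runs of this task, it can be convenient to assemble the CLI invocation shown under "Run this task" programmatically, for example to try a handful of the 15 samples before a full run. The following is a minimal sketch, not part of the Harbor tooling: `build_eval_command` is a hypothetical helper name, and it assumes the standard `--limit` option of `inspect eval` is what you want for restricting the sample count. It only builds and prints the command; it does not execute it.

```python
import shlex
from typing import List, Optional

# Task path as it appears in the CLI example on this page.
TASK = "inspect_harbor/quesma_compilebench"


def build_eval_command(model: str, limit: Optional[int] = None) -> List[str]:
    """Return the argv list for an `inspect eval` run of this task.

    Hypothetical helper for illustration. `limit`, if given, is passed
    through as `--limit` to restrict how many samples are evaluated.
    """
    cmd = ["inspect", "eval", TASK, "--model", model]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    return cmd


if __name__ == "__main__":
    # Print a shell-safe command line for a 3-sample smoke test.
    print(shlex.join(build_eval_command("openai/gpt-5", limit=3)))
```

The returned list can be handed to `subprocess.run` as-is, which avoids shell quoting problems if the model name or other arguments come from configuration.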