quesma/compilebench

Coding

CompileBench: real-world build/compile tasks (curl, GNU coreutils, jq, etc.) ranging from easy builds to reviving 2003-era code and cross-compiling.

Run this task

CLI:

inspect eval inspect_harbor/quesma_compilebench --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import quesma_compilebench

eval(quesma_compilebench(), model="openai/gpt-5")
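
For a quick smoke test before a full run over all 15 samples, the CLI call above can be assembled programmatically. The following is a minimal sketch, not part of inspect_harbor: the helper name `build_inspect_command` is hypothetical, and it assumes the `--limit` option of `inspect eval` is used to cap the number of samples.

```python
import shlex
from typing import List, Optional

# Task path taken from the CLI example above.
TASK = "inspect_harbor/quesma_compilebench"


def build_inspect_command(model: str, limit: Optional[int] = None) -> List[str]:
    """Return the argv list for `inspect eval` on the CompileBench task.

    Hypothetical helper for illustration: it only builds the command
    and never executes it.
    """
    if limit is not None and limit < 1:
        raise ValueError("limit must be a positive integer")
    argv = ["inspect", "eval", TASK, "--model", model]
    if limit is not None:
        # Assumption: `--limit N` caps the run at N samples.
        argv += ["--limit", str(limit)]
    return argv


if __name__ == "__main__":
    # Print a shell-ready command line for a three-sample trial run.
    print(shlex.join(build_inspect_command("openai/gpt-5", limit=3)))
```

The returned list can be passed to `subprocess.run(argv, check=True)` once the command looks right; keeping construction separate from execution makes it easy to review what will be run.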

Dataset information

Harbor registry: quesma/compilebench
Inspect task: quesma_compilebench
Latest digest: sha256:8b7ea3e0618b0f3fb2db1b5695628cfc2b2d5f405c5624b3b44d1602beca338a
Samples: 15
Paper: arXiv
Source: https://github.com/QuesmaOrg/CompileBench
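
The digest listed above identifies a specific dataset version. A small format check can catch a truncated copy before the value is stored in a config or lockfile. This sketch assumes only the `sha256:<64 hex characters>` shape visible in the listing; the function name is hypothetical, and the check neither contacts the Harbor registry nor verifies any content.

```python
import re

# Shape assumed from the listing: "sha256:" followed by 64 lowercase hex digits.
_DIGEST_RE = re.compile(r"sha256:[0-9a-f]{64}")

# Value copied from the dataset information above.
LATEST_DIGEST = (
    "sha256:8b7ea3e0618b0f3fb2db1b5695628cfc2b2d5f405c5624b3b44d1602beca338a"
)


def is_wellformed_digest(value: str) -> bool:
    """Return True if `value` has the sha256:<64 hex> shape (format check only)."""
    return _DIGEST_RE.fullmatch(value) is not None
```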

See Task Parameters for the parameter set shared across all Harbor tasks.