evoeval/evoeval
Coding
EvoEval: evolving suite that mutates HumanEval problems along several axes (difficulty, creative, subtle, tool-use) for a contamination-resistant view of LLM coding ability.
Run this task
CLI:
inspect eval inspect_harbor/evoeval --model openai/gpt-5Python:
from inspect_ai import eval
from inspect_harbor import evoeval
eval(evoeval(), model="openai/gpt-5")Dataset information
| Harbor registry | evoeval/evoeval |
| Inspect task | evoeval |
| Latest digest | sha256:97c284d905f075164e1ae262e853f2e6e27a3448d07e623f64205216410040ea |
| Samples | 100 |
| Source | https://github.com/evo-eval/evoeval |
See Task Parameters for the parameter set shared across all Harbor tasks.