evoeval/evoeval

Coding

EvoEval is a benchmark suite that evolves HumanEval problems along several axes (difficulty, creative, subtle, tool-use) to give a contamination-resistant view of LLM coding ability.

Run this task

CLI:

inspect eval inspect_harbor/evoeval --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import evoeval

eval(evoeval(), model="openai/gpt-5")

Dataset information

Harbor registry: evoeval/evoeval
Inspect task: evoeval
Latest digest: sha256:97c284d905f075164e1ae262e853f2e6e27a3448d07e623f64205216410040ea
Samples: 100
Source: https://github.com/evo-eval/evoeval
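
The digest above uses the common `algorithm:hex` form. This page does not say which bytes Harbor hashes to produce it, so the following is only a minimal sketch of how such a string could be parsed and compared against locally computed bytes. The helper names `split_digest` and `matches_digest` are hypothetical and are not part of Harbor or Inspect.

```python
import hashlib
import hmac

# Digest string as listed under "Dataset information" on this page.
LISTED_DIGEST = (
    "sha256:97c284d905f075164e1ae262e853f2e6e27a3448d07e623f64205216410040ea"
)


def split_digest(digest: str) -> tuple[str, str]:
    """Split an 'algorithm:hex' digest string into (algorithm, hex)."""
    algorithm, sep, hex_value = digest.partition(":")
    if not sep or not algorithm or not hex_value:
        raise ValueError(f"expected 'algorithm:hex', got {digest!r}")
    return algorithm.lower(), hex_value.lower()


def matches_digest(data: bytes, digest: str) -> bool:
    """Return True if hashing `data` with the named algorithm gives the digest.

    Assumption: the caller already knows which bytes the digest covers;
    that is not documented on this page.
    """
    algorithm, expected_hex = split_digest(digest)
    actual_hex = hashlib.new(algorithm, data).hexdigest()
    # Constant-time comparison; both values are ASCII hex strings.
    return hmac.compare_digest(actual_hex, expected_hex)


if __name__ == "__main__":
    algorithm, hex_value = split_digest(LISTED_DIGEST)
    print(algorithm, len(hex_value))
```

Pinning the listed digest in an evaluation log is one way to record which dataset version a run used, since the digest changes whenever the registry entry is updated.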

See Task Parameters for the parameter set shared across all Harbor tasks.