nvats/codeskills-bench

Coding

A small set of real-world programming tasks covering bug fixes, merge-conflict resolution, dependency cleanup, API migration, and performance regressions in compact Python repositories.


Run this task

CLI:

inspect eval inspect_harbor/nvats_codeskills_bench --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import nvats_codeskills_bench

eval(nvats_codeskills_bench(), model="openai/gpt-5")
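Before running the full set, it can help to try a handful of samples first. The sketch below assumes the `limit` argument of `inspect_ai.eval`, which caps the number of samples evaluated; `run_smoke_test` and its default of 3 samples are hypothetical names and values chosen for illustration, not part of the task.

```python
def run_smoke_test(model: str = "openai/gpt-5", limit: int = 3):
    """Evaluate only the first `limit` samples (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require
    # the packages to be installed.
    from inspect_ai import eval
    from inspect_harbor import nvats_codeskills_bench

    # `limit` caps how many samples are evaluated.
    logs = eval(nvats_codeskills_bench(), model=model, limit=limit)

    # eval() returns a list of logs, one per task; a single task was passed.
    return logs[0]
```

Calling `run_smoke_test()` and inspecting the returned log's `status` field is usually enough to confirm that credentials and the sandbox are set up before committing to all samples.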

Dataset information

Harbor registry: nvats/codeskills-bench
Inspect task: nvats_codeskills_bench
Latest digest: sha256:eeeb856e813c7c3a27a65ca459eff6e254b081560cf9d45f53503a14db527156
Samples: 23
Source: https://github.com/namanvats/codeskills-bench
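If you record the digest alongside results for reproducibility, a quick format check catches truncated or mistyped values. This is a minimal sketch using only the standard library; `parse_digest` is a hypothetical helper, and it validates the shape of the string only. What the digest is computed over is not stated here, so the sketch does not try to recompute it.

```python
import re

# Digest copied from the "Latest digest" field above.
LATEST_DIGEST = "sha256:eeeb856e813c7c3a27a65ca459eff6e254b081560cf9d45f53503a14db527156"

# A SHA-256 digest is 32 bytes, i.e. 64 lowercase hex characters.
_SHA256_DIGEST = re.compile(r"sha256:[0-9a-f]{64}")


def parse_digest(digest: str) -> tuple[str, str]:
    """Split '<algorithm>:<hex>' after checking it is a well-formed SHA-256 digest."""
    if not _SHA256_DIGEST.fullmatch(digest):
        raise ValueError(f"not a sha256 digest: {digest!r}")
    algorithm, _, hex_value = digest.partition(":")
    return algorithm, hex_value
```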

See Task Parameters for the parameter set shared across all Harbor tasks.