adyen/dabstep

Tags: Professional, Finance, Assistants, Coding

DABstep: real-world data analysis tasks from Adyen’s workloads requiring multi-step reasoning by LLM agents.

Run this task

CLI:

inspect eval inspect_harbor/adyen_dabstep --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import adyen_dabstep

eval(adyen_dabstep(), model="openai/gpt-5")
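Before a full run over all 450 samples, a smaller smoke run may be useful. The sketch below assumes the `limit` option of inspect_ai's `eval()` to cap the number of samples; the helper name `run_smoke_test` and its default of 5 samples are illustrative and not part of the registry entry.

```python
def run_smoke_test(model: str = "openai/gpt-5", limit: int = 5):
    """Evaluate only the first `limit` samples of adyen_dabstep (hypothetical helper)."""
    # Imports are deferred so that defining this helper does not require
    # inspect_ai or inspect_harbor to be installed.
    from inspect_ai import eval
    from inspect_harbor import adyen_dabstep

    # `limit` caps how many samples are evaluated; the task itself is unchanged.
    return eval(adyen_dabstep(), model=model, limit=limit)
```

Calling `run_smoke_test()` needs both packages installed and credentials for the chosen model provider. The CLI equivalent would be adding `--limit 5` to the `inspect eval` command above.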

Dataset information

Harbor registry: adyen/dabstep
Inspect task: adyen_dabstep
Latest digest: sha256:0edf62c0bdf7003b1d1f934f1547df1c051877e076d5b6f6a2d99caf8b6432b3
Samples: 450
Paper: arxiv
Source: https://huggingface.co/datasets/adyen/DABstep
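When recording the digest above in run notes or configuration, a quick shape check can catch copy-paste truncation. This is a minimal sketch using only the standard library; the helper `parse_sha256_digest` is hypothetical, and it only validates the string format. How Harbor computes or uses the digest is not stated here.

```python
import re

# Digest string copied from the dataset information above.
DIGEST = "sha256:0edf62c0bdf7003b1d1f934f1547df1c051877e076d5b6f6a2d99caf8b6432b3"


def parse_sha256_digest(digest: str) -> str:
    """Return the hex part of a 'sha256:<hex>' string, or raise ValueError."""
    algorithm, sep, hex_part = digest.partition(":")
    if sep != ":" or algorithm != "sha256":
        raise ValueError(f"expected a 'sha256:' prefix, got: {digest!r}")
    # A SHA-256 value is 32 bytes, i.e. exactly 64 lowercase hex characters here.
    if not re.fullmatch(r"[0-9a-f]{64}", hex_part):
        raise ValueError(f"expected 64 lowercase hex characters, got: {hex_part!r}")
    return hex_part


hex_part = parse_sha256_digest(DIGEST)
```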

See Task Parameters for the parameter set shared across all Harbor tasks.