adyen/dabstep
Professional
Finance
Assistants
Coding
DABstep: real-world data analysis tasks from Adyen’s workloads requiring multi-step reasoning by LLM agents.
Run this task
CLI:
inspect eval inspect_harbor/adyen_dabstep --model openai/gpt-5
Python:
from inspect_ai import eval
from inspect_harbor import adyen_dabstep
eval(adyen_dabstep(), model="openai/gpt-5")
Dataset information
| Field | Value |
| --- | --- |
| Harbor registry | adyen/dabstep |
| Inspect task | adyen_dabstep |
| Latest digest | sha256:0edf62c0bdf7003b1d1f934f1547df1c051877e076d5b6f6a2d99caf8b6432b3 |
| Samples | 450 |
| Paper | arxiv |
| Source | https://huggingface.co/datasets/adyen/DABstep |
See Task Parameters for the parameter set shared across all Harbor tasks.
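When scripting runs, it can help to assemble the CLI invocation from "Run this task" as an argument list rather than a hand-typed string. The sketch below is illustrative only: `build_command` and `TASK` are hypothetical names introduced here, not part of inspect_ai or inspect_harbor, and the script only prints the command rather than launching it.

```python
import shlex

# Task path as it appears in the CLI example on this page.
TASK = "inspect_harbor/adyen_dabstep"


def build_command(model: str, task: str = TASK) -> list[str]:
    """Return the argv list for `inspect eval <task> --model <model>`.

    Hypothetical helper for illustration; it does not run anything.
    """
    return ["inspect", "eval", task, "--model", model]


if __name__ == "__main__":
    cmd = build_command("openai/gpt-5")
    # Print a shell-safe rendering of the command for inspection.
    print(shlex.join(cmd))
    # To launch the run (assumes inspect_ai and inspect_harbor are installed
    # and model credentials are configured):
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```

Keeping the command as a list avoids shell-quoting mistakes when passing it to `subprocess.run`, and makes it easy to loop over several model names.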