quesma/otel-bench

Coding

AI-agent benchmark for OpenTelemetry instrumentation tasks across multiple programming languages.

Run this task

CLI:

inspect eval inspect_harbor/quesma_otel_bench --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import quesma_otel_bench

eval(quesma_otel_bench(), model="openai/gpt-5")

Dataset information

Harbor registry: quesma/otel-bench
Inspect task: quesma_otel_bench
Latest digest: sha256:a6ca75f833dedb831238b42c5dccab7f4d95713db9f6933560a6cca2c052b4b9
Samples: 26
Source: https://github.com/QuesmaOrg/otel-bench
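
If you record the digest above to pin a dataset version in your own scripts, it can help to check that the string was copied intact. The following is a minimal sketch that uses only the Python standard library. The helper name parse_digest is hypothetical and is not part of inspect_ai or inspect_harbor; the sketch assumes the digest always has the form sha256:<64 lowercase hex characters>, as it does on this page.

```python
import re

# Digest string copied from the "Latest digest" field above.
LATEST_DIGEST = (
    "sha256:a6ca75f833dedb831238b42c5dccab7f4d95713db9f6933560a6cca2c052b4b9"
)

# Assumption: digests are "sha256:" followed by exactly 64 lowercase hex digits.
_DIGEST_RE = re.compile(r"(?P<algo>sha256):(?P<hex>[0-9a-f]{64})")


def parse_digest(digest: str) -> tuple[str, str]:
    """Split a digest into (algorithm, hex value), or raise ValueError.

    Hypothetical helper for illustration only; it checks the format of the
    string and does not verify any downloaded content against the digest.
    """
    match = _DIGEST_RE.fullmatch(digest.strip())
    if match is None:
        raise ValueError(f"not a well-formed sha256 digest: {digest!r}")
    return match["algo"], match["hex"]


algo, hex_value = parse_digest(LATEST_DIGEST)
print(algo, len(hex_value))  # prints "sha256 64"
```

A truncated or mistyped digest raises ValueError instead of being silently stored.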

See Task Parameters for the parameter set shared across all Harbor tasks.