tencent/autocodebench

Coding

Multilingual automated code generation benchmark evaluating LLMs across diverse programming tasks and languages.

Run this task

CLI:

inspect eval inspect_harbor/tencent_autocodebench --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import tencent_autocodebench

eval(tencent_autocodebench(), model="openai/gpt-5")
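For a quick smoke test before a full run over the dataset, it can help to cap the number of samples. The following is a minimal sketch, not part of the Harbor documentation: it assembles the CLI invocation shown above and appends Inspect's `--limit` option. The helper name `build_inspect_command` is hypothetical and introduced only for illustration.

```python
import shlex
from typing import List, Optional


def build_inspect_command(task: str, model: str, limit: Optional[int] = None) -> List[str]:
    """Assemble an `inspect eval` argument list (hypothetical helper).

    `limit`, when given, is passed as `--limit` to cap the number of samples.
    """
    cmd = ["inspect", "eval", task, "--model", model]
    if limit is not None:
        if limit < 1:
            raise ValueError("limit must be a positive integer")
        cmd += ["--limit", str(limit)]
    return cmd


# Same task and model as the CLI example above, restricted to 5 samples.
cmd = build_inspect_command(
    "inspect_harbor/tencent_autocodebench", "openai/gpt-5", limit=5
)
print(shlex.join(cmd))

# To execute the command (assumes inspect_ai and inspect_harbor are installed
# and model credentials are configured):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the argument list separately from running it keeps the command easy to inspect or log before any model calls are made.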

Dataset information

Harbor registry: tencent/autocodebench
Inspect task: tencent_autocodebench
Latest digest: sha256:da30a5e97eeccc2d024a2ff947fb99966ea88bed5b7077ee451d2ae72e645caf
Samples: 200
Paper: arxiv
Source: https://github.com/Tencent-Hunyuan/AutoCodeBenchmark

See Task Parameters for the parameter set shared across all Harbor tasks.