binary-audit/binary-audit

Cybersecurity

BinaryAudit: AI-agent benchmark for finding backdoors hidden in compiled binaries via reverse engineering.

← Back to Registry

Run this task

CLI:

inspect eval inspect_harbor/binary_audit --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import binary_audit

eval(binary_audit(), model="openai/gpt-5")

Dataset information

Harbor registry binary-audit/binary-audit
Inspect task binary_audit
Latest digest sha256:b1ad968bbe8ad16a5c4463814d169cfbda30d08c179b9a5401e5db63953f6bc2
Samples 46
Source https://github.com/QuesmaOrg/BinaryAudit

See Task Parameters for the parameter set shared across all Harbor tasks.