futurehouse/bixbench-cli

Science
Biology
Coding

CLI variant of BixBench: agents solve the same bioinformatics analysis tasks via a command-line / shell interface rather than notebook authoring.

Run this task

CLI:

inspect eval inspect_harbor/futurehouse_bixbench_cli --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import futurehouse_bixbench_cli

eval(futurehouse_bixbench_cli(), model="openai/gpt-5")

Dataset information

Harbor registry: futurehouse/bixbench-cli
Inspect task: futurehouse_bixbench_cli
Latest digest: sha256:a856307be0c75e7403e9113e65c986d897dead9dbe416f588cfc60a15f1b14c2
Samples: 205
Paper: arxiv
Source: https://github.com/Future-House/BixBench
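
The Harbor registry name and the Inspect task name listed above differ only in punctuation. The sketch below shows one way that mapping could be reproduced when scripting runs. It assumes the rule is simply "replace `/` and `-` with `_`", which is inferred from this single entry and not confirmed by the registry. The helper names `inspect_task_name` and `inspect_eval_command` are hypothetical and are not part of `inspect_harbor`.

```python
import shlex


def inspect_task_name(registry_name: str) -> str:
    """Hypothetical helper: derive an Inspect task name from a Harbor registry name.

    Assumption: '/' and '-' both become '_', as in this entry
    (futurehouse/bixbench-cli -> futurehouse_bixbench_cli).
    """
    return registry_name.replace("/", "_").replace("-", "_")


def inspect_eval_command(registry_name: str, model: str) -> str:
    """Build the `inspect eval` command line in the form shown under "Run this task"."""
    task = inspect_task_name(registry_name)
    # shlex.join quotes any argument that would be unsafe in a POSIX shell.
    return shlex.join(["inspect", "eval", f"inspect_harbor/{task}", "--model", model])


if __name__ == "__main__":
    print(inspect_eval_command("futurehouse/bixbench-cli", "openai/gpt-5"))
```

The function only builds the command string and does not run it, so it can be checked without `inspect_ai` installed or a model API key configured.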

See Task Parameters for the parameter set shared across all Harbor tasks.