futurehouse/bixbench-cli
Science
Biology
Coding
CLI variant of BixBench: agents solve the same bioinformatics analysis tasks via a command-line / shell interface rather than notebook authoring.
Run this task
CLI:
inspect eval inspect_harbor/futurehouse_bixbench_cli --model openai/gpt-5
Python:
from inspect_ai import eval
from inspect_harbor import futurehouse_bixbench_cli
eval(futurehouse_bixbench_cli(), model="openai/gpt-5")
Dataset information
| Field | Value |
| --- | --- |
| Harbor registry | futurehouse/bixbench-cli |
| Inspect task | futurehouse_bixbench_cli |
| Latest digest | sha256:a856307be0c75e7403e9113e65c986d897dead9dbe416f588cfc60a15f1b14c2 |
| Samples | 205 |
| Paper | arxiv |
| Source | https://github.com/Future-House/BixBench |
See Task Parameters for the parameter set shared across all Harbor tasks.
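With 205 samples in the dataset, it may be worth trying a handful first before committing to a full run. The sketch below only assembles the command line shown under "Run this task" and does not execute it. It assumes Inspect's `--limit` option for capping the number of samples; `build_command`, `TASK`, and `TOTAL_SAMPLES` are hypothetical names introduced here for illustration and are not part of inspect_harbor.

```python
import shlex
from typing import List, Optional

# Task name and sample count copied from this page.
TASK = "inspect_harbor/futurehouse_bixbench_cli"
TOTAL_SAMPLES = 205


def build_command(model: str, limit: Optional[int] = None) -> List[str]:
    """Build the argv for `inspect eval`, optionally capped to `limit` samples.

    Hypothetical helper: it only constructs the argument list and never
    runs the evaluation. `--limit` is assumed to be supported by the
    installed Inspect CLI.
    """
    if limit is not None and not 1 <= limit <= TOTAL_SAMPLES:
        raise ValueError(f"limit must be between 1 and {TOTAL_SAMPLES}")
    argv = ["inspect", "eval", TASK, "--model", model]
    if limit is not None:
        argv += ["--limit", str(limit)]
    return argv


if __name__ == "__main__":
    # Print a shell-safe command for a five-sample trial run.
    print(shlex.join(build_command("openai/gpt-5", limit=5)))
```

The resulting list can be passed to `subprocess.run` once the command looks right; dropping `limit` reproduces the full-dataset command from the CLI line.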