futurehouse/bixbench-cli

Science

Biology

Coding

CLI variant of BixBench: agents solve the same bioinformatics analysis tasks as in BixBench, but through a command-line (shell) interface instead of by authoring notebooks.

Run this task

CLI:

inspect eval inspect_harbor/futurehouse_bixbench_cli --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import futurehouse_bixbench_cli

eval(futurehouse_bixbench_cli(), model="openai/gpt-5")
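
With 205 samples, a full run can be slow, so it may help to try a small subset first. The following is a minimal sketch, not part of the Harbor docs: it assembles the same CLI command shown above and optionally appends Inspect's `--limit` option to cap the number of samples. The helper names `build_eval_command` and `run_eval` are hypothetical, and the limit of 5 is an arbitrary choice for illustration.

```python
import shlex
import subprocess
from typing import List, Optional

# Task path taken from the CLI example above.
TASK = "inspect_harbor/futurehouse_bixbench_cli"


def build_eval_command(model: str, limit: Optional[int] = None) -> List[str]:
    """Build the argv list for `inspect eval` (hypothetical helper).

    `limit`, when given, is passed through as Inspect's --limit option,
    which restricts the run to the first N samples.
    """
    cmd = ["inspect", "eval", TASK, "--model", model]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    return cmd


def run_eval(model: str, limit: Optional[int] = None) -> int:
    """Run the command and return its exit code.

    Assumes inspect_ai and inspect_harbor are installed and that
    credentials for the chosen model provider are configured.
    """
    return subprocess.run(build_eval_command(model, limit)).returncode


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_eval_command("openai/gpt-5", limit=5)))
```

Calling `run_eval("openai/gpt-5", limit=5)` would launch the subset run; dropping `limit` reproduces the full command from the CLI example.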

Harbor registry	futurehouse/bixbench-cli
Inspect task	`futurehouse_bixbench_cli`
Latest digest	sha256:a856307be0c75e7403e9113e65c986d897dead9dbe416f588cfc60a15f1b14c2
Samples	205
Paper	arxiv
Source	https://github.com/Future-House/BixBench

See Task Parameters for the parameter set shared across all Harbor tasks.