satbench/satbench

Reasoning

SATBench: logical-reasoning puzzles automatically generated from SAT formulas with adjustable difficulty, validated through both LLM and SAT-solver consistency checks.
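
As a rough illustration of what a SAT-solver consistency check verifies, the sketch below decides whether a small CNF formula is satisfiable by brute force. It is not SATBench's generation or validation pipeline. The function name `is_satisfiable`, the DIMACS-style integer encoding of literals, and the two example formulas are assumptions made for this example, and a real setup would likely use a dedicated solver instead of enumeration.

```python
from itertools import product

# Assumed encoding (DIMACS style): a clause is a list of non-zero ints,
# where 3 means variable x3 and -3 means NOT x3.
Clause = list[int]


def is_satisfiable(clauses: list[Clause], num_vars: int) -> bool:
    """Brute-force check: try every assignment of num_vars booleans."""
    for bits in product([False, True], repeat=num_vars):
        # A literal is true when its sign agrees with the assigned value;
        # a clause holds if any literal is true; the formula needs all clauses.
        if all(
            any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return True
    return False


# (x1 OR x2) AND (NOT x1 OR x2): satisfiable, for example with x2 = True.
sat_formula = [[1, 2], [-1, 2]]
# (x1) AND (NOT x1): no assignment can satisfy both clauses.
unsat_formula = [[1], [-1]]
```

Enumeration takes time exponential in the number of variables, so this only suits toy formulas; it is meant to show the satisfiable/unsatisfiable distinction that each puzzle's label rests on.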

Run this task

CLI:

inspect eval inspect_harbor/satbench --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import satbench

eval(satbench(), model="openai/gpt-5")

Dataset information

Harbor registry: satbench/satbench
Inspect task: satbench
Latest digest: sha256:4b921bb49ebe0513a784783eeac9561e9d216339de1e4cb20c43018dd0502a1e
Samples: 1000
Paper: arxiv
Source: https://github.com/Anjiang-Wei/SATBench

See Task Parameters for the parameter set shared across all Harbor tasks.