openai/simpleqa

Knowledge

SimpleQA: short, fact-seeking questions adversarially collected against GPT-4 to measure short-form factuality and calibration of frontier LLMs.
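
As a rough illustration of what "short-form factuality" means in practice, the SimpleQA paper grades each answer as correct, incorrect, or not attempted, and summarizes a run with overall accuracy, accuracy on attempted questions, and an F-score that is the harmonic mean of the two. The sketch below assumes per-question grades are already available as plain strings. The label names and the `summarize` helper are hypothetical and are not part of `inspect_harbor` or Inspect; the scorer this task actually uses may report different metrics.

```python
from collections import Counter

# Hypothetical label strings; the real task's scorer may use other values.
CORRECT, INCORRECT, NOT_ATTEMPTED = "correct", "incorrect", "not_attempted"


def summarize(grades):
    """Aggregate per-question grades into SimpleQA-style summary metrics."""
    total = len(grades)
    if total == 0:
        raise ValueError("no grades to summarize")

    counts = Counter(grades)
    attempted = counts[CORRECT] + counts[INCORRECT]

    # Fraction of all questions answered correctly.
    correct = counts[CORRECT] / total
    # Fraction answered correctly among those the model chose to answer.
    correct_given_attempted = counts[CORRECT] / attempted if attempted else 0.0

    # Harmonic mean of the two accuracies (the F-score described in the paper).
    denom = correct + correct_given_attempted
    f_score = 2 * correct * correct_given_attempted / denom if denom else 0.0

    return {
        "correct": correct,
        "not_attempted": counts[NOT_ATTEMPTED] / total,
        "correct_given_attempted": correct_given_attempted,
        "f_score": f_score,
    }


if __name__ == "__main__":
    example = [CORRECT] * 6 + [INCORRECT] * 2 + [NOT_ATTEMPTED] * 2
    print(summarize(example))
```

Separating "not attempted" from "incorrect" is what lets the benchmark reward a model for declining to answer rather than guessing wrong.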

Run this task

CLI:

inspect eval inspect_harbor/openai_simpleqa --model openai/gpt-5

Python:

from inspect_ai import eval
from inspect_harbor import openai_simpleqa

eval(openai_simpleqa(), model="openai/gpt-5")

Dataset information

Harbor registry: openai/simpleqa
Inspect task: openai_simpleqa
Latest digest: sha256:22f25921ded881aca13cf5d18b8d3bbc91e2b9bf44d17108292dcc40fcb5f0d4
Samples: 1000
Paper: arXiv
Source: https://github.com/openai/simple-evals

See Task Parameters for the parameter set shared across all Harbor tasks.