Running Flows

Once you’ve defined your Flow configuration, you can execute evaluations using the flow run command. Flow also provides tools for previewing configurations and controlling runtime behavior.

The flow run Command

Execute your evaluation workflow:

flow run config.py

What happens when you run this:

  1. Flow loads your configuration file
  2. Creates an isolated virtual environment
  3. Installs dependencies
  4. Resolves all defaults and matrix expansions
  5. Executes evaluations via Inspect AI’s eval_set()
  6. Stores logs in log_dir
  7. Cleans up the temporary environment

Common CLI Flags

Preview without running:

flow run config.py --dry-run

Shows the completely resolved configuration:

  • Creates virtual environment
  • Applies all defaults
  • Expands all matrix functions
  • Instantiates all Python objects

This is invaluable for debugging what settings will actually be used in your evaluations.

Override log directory:

flow run config.py --log-dir ./experiments/baseline

Changes where logs and results are stored.

Runtime overrides:

flow run config.py \
  --set options.limit=100 \
  --set defaults.config.temperature=0.5

Override any configuration value at runtime. See CLI Overrides for more details.

The flow config Command

Preview your configuration before running:

flow config config.py

Displays the parsed configuration as YAML with CLI overrides applied. Does not create a virtual environment or instantiate Python objects.

TipWhen to Use Each Command
  • flow config - Quick syntax check, verify overrides
  • flow run --dry-run - Debug defaults resolution, inspect final settings
  • flow run - Execute evaluations

Running from Python

You can run Flow evaluations programmatically using the Python API:

run.py
from inspect_flow import FlowSpec, FlowTask
from inspect_flow.api import run

spec = FlowSpec(
    log_dir="logs",
    tasks=[
        FlowTask(
            name="inspect_evals/gpqa_diamond",
            model="openai/gpt-4o",
        ),
        FlowTask(
            name="inspect_evals/mmlu_0_shot",
            model="openai/gpt-4o",
        ),
    ],
)
run(spec=spec)

The inspect_flow.api module provides three functions:

  • run() - Execute a Flow spec with full environment setup (equivalent to flow run)
  • load_spec() - Load a Flow configuration from a Python file into a FlowSpec object
  • config() - Get the resolved configuration as YAML (equivalent to flow config)

See the API Reference for detailed parameter documentation.

Results and Logs

Logs Directory

Evaluation results are stored in the log_dir:

logs/
├── 2025-11-21T17-38-20+01-00_gpqa-diamond_KvJBGowidXSCLRhkKQbHYA.eval
├── 2025-11-21T17-38-20+01-00_mmlu-0-shot_Vnu2A3M2wPet5yobLiCQmZ.eval
├── .eval-set-id
├── eval-set.json
├── flow.yaml
├── flow-requirements.txt
└── ...

Directory structure:

  • Flow passes the log_dir directly to Inspect AI eval_set() for evaluation log storage
  • Inspect AI handles the actual evaluation log file naming and storage
  • Log file naming conventions follow Inspect AI’s standards (see Inspect AI logging docs)
  • Flow automatically saves the resolved configuration as flow.yaml in the log directory
  • Flow saves the exact version of packages installed in the virtual environment as flow-requirements.txt
  • The .eval-set-id file contains the eval set identifier
  • The eval-set.json file contains eval set metadata

Log formats:

  • .eval - Binary Inspect AI log format (default, high-performance)
  • .json - JSON format (if log_format="json" in FlowOptions)

Viewing Results

Using Inspect View:

inspect view

Opens the Inspect AI viewer to explore evaluation logs interactively. Inspect View can automatically detect Flow config files in the log directory and render them in the UI, making it easier to review the spec for the evaluations.

Click the Flow icon in the top right hand corner to view the Flow config.

Eval list rendered by Inspect View

The Flow config file is rendered in YAML format.

Flow config rendered by Inspect View

S3 Support

Store logs directly to S3:

FlowSpec(
    log_dir="s3://my-bucket/experiments/baseline",
    tasks=[...]
)

For more information on configuring an S3 bucket as a logs directory, refer to the Inspect AI documentation.