Flow Store

The Flow Store is a centralized registry of evaluation log file paths that tracks your completed evaluations and enables log reuse across directories, projects, and team members.

Note

Unlike Inspect AI’s Eval Set which only reuses logs from the same log directory, the Flow Store indexes logs from multiple locations and makes them available for reuse across any Flow run.

How It Works

The Flow Store has two independent capabilities: log indexing (writing to the store) and log matching (reading from the store). By default, indexing is on and matching is off — you opt into matching when you want cross-directory log reuse.

Log Indexing

Log indexing is on by default and requires zero configuration. After each flow run, Flow automatically adds your completed logs to the Flow Store. This builds up a registry of all your evaluations over time, ready for reuse when you need it.

On your first run, the Flow Store is automatically created in the background at the default location.

Log Matching

Log matching is off by default. Enable it with --store-read to reuse logs from across directories and projects:

flow run config.py --store-read

When log matching is enabled, Flow:

  1. Searches the Flow Store for logs matching your task (based on task configuration)
  2. Selects the best log (most completed samples, then most recent)
  3. Copies it to your log directory for Inspect AI’s Eval Set to reuse
  4. Resumes incomplete logs by only running missing samples
Note

Even without --store-read, Inspect AI’s Eval Set still reuses logs already present in your log_dir. Store matching adds cross-directory reuse on top of that.
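The selection rule in step 2 can be sketched in plain Python. The record fields below are hypothetical, not the actual Flow Store schema:

```python
# Hypothetical log records; the actual Flow Store schema differs.
logs = [
    {"path": "a.eval", "completed_samples": 80,  "added_at": "2026-01-30T10-00"},
    {"path": "b.eval", "completed_samples": 100, "added_at": "2026-01-29T09-00"},
    {"path": "c.eval", "completed_samples": 100, "added_at": "2026-01-31T12-00"},
]

# "Most completed samples, then most recent": compare on a tuple of both keys.
# ISO-style timestamps sort correctly as strings.
best = max(logs, key=lambda log: (log["completed_samples"], log["added_at"]))
print(best["path"])  # c.eval: most samples, latest timestamp
```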

Example

Let’s walk through a concrete example showing how the Flow Store tracks and reuses logs across runs.

Run 1: Initial evaluation

Start with a simple configuration evaluating one model:

config.py
from inspect_flow import FlowSpec, FlowTask

FlowSpec(
    log_dir="project-1",
    tasks=[
        FlowTask(
            name="inspect_evals/gpqa_diamond",
            model="openai/gpt-4o",
        )
    ],
)
This configuration evaluates gpt-4o on gpqa_diamond.

When you run flow run config.py, Flow:

  1. Runs the evaluation
  2. Saves logs to project-1/
  3. Indexes the log in the Flow Store: project-1/2026-01-31T15-55_gpqa-diamond_ABCeD.eval

Run 2: Cross-directory reuse

Now modify your config to use a new log directory and add a second model:

config.py
from inspect_flow import FlowSpec, FlowTask

FlowSpec(
-   log_dir="project-1",
+   log_dir="project-2",
    tasks=[
        FlowTask(
            name="inspect_evals/gpqa_diamond",
            model="openai/gpt-4o",
        ),
+       FlowTask(
+           name="inspect_evals/gpqa_diamond",
+           model="openai/gpt-5",
+       )
    ]
)
The changes:

  1. Change to a new log directory
  2. Same task from Run 1 (gpt-4o on gpqa_diamond)
  3. New task (gpt-5 on gpqa_diamond)

Since you’re using a new log directory with no existing logs, enable store matching to reuse Run 1’s results:

flow run config.py --store-read

Flow:

  1. Searches the Flow Store and finds Run 1’s gpt-4o log from project-1/
  2. Copies it to project-2/ for reuse
  3. Skips re-evaluating gpt-4o (already complete)
  4. Only evaluates the new gpt-5 task
  5. Saves both logs to project-2/
  6. Indexes both log paths in the Flow Store

Result: The second run completes much faster by reusing the gpt-4o evaluation from Run 1.

Note

If Run 1 was interrupted with only 80/100 samples completed, Run 2 would copy the partial log and automatically resume from where it left off, only evaluating the remaining 20 samples for gpt-4o.
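The resume behavior described in the note can be illustrated as a set difference. The sample IDs below are hypothetical; Inspect AI tracks this internally:

```python
# Illustration of resumption: only samples missing from the partial log are
# re-run. Sample IDs are hypothetical; Inspect AI handles this internally.
all_samples = set(range(1, 101))   # the task defines 100 samples
completed = set(range(1, 81))      # the interrupted log finished 80 of them
remaining = sorted(all_samples - completed)

print(len(remaining))  # 20 samples left to evaluate
```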

Backend

The Flow Store is a lightweight registry that tracks log file paths, not the logs themselves. For each log, the Flow Store records:

  • Log file path: Absolute path to the log file
  • Task identifier: Hash of the task configuration (for matching)
  • Timestamp: When the log was added to the Flow Store
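Conceptually, each entry resembles the following record. The field names are illustrative, not the actual Delta Lake schema:

```python
from dataclasses import dataclass

# A minimal sketch of what the store records per log; field names are
# illustrative, not the actual Delta Lake schema.
@dataclass
class StoreRecord:
    log_path: str   # absolute path to the log file
    task_id: str    # hash of the task configuration, used for matching
    added_at: str   # when the log was added to the store

record = StoreRecord(
    log_path="/home/user/project-1/2026-01-31T15-55_gpqa-diamond_ABCeD.eval",
    task_id="3f2a9c",  # placeholder hash
    added_at="2026-01-31T15:58:00",
)
```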

The Flow Store uses Delta Lake as its storage backend, providing ACID transactions and efficient querying, and supports both local filesystems and S3 storage. The store is created automatically the first time you run flow run or import logs with flow store import, at the default location:

~/.local/share/inspect_flow  # Linux
~/Library/Application Support/inspect_flow  # macOS
C:\Users\<username>\AppData\Local\inspect_flow  # Windows

Configuration

Read and write control

Log indexing (write) and log matching (read) are controlled independently via CLI flags or FlowStoreConfig:

CLI flags:

--store-read / --no-store-read    # Match logs from the store (default: --no-store-read)
--store-write / --no-store-write  # Index completed logs in the store (default: --store-write)

FlowSpec:

from inspect_flow import FlowSpec, FlowStoreConfig

FlowSpec(
    store=FlowStoreConfig(
        read=True,   # match logs from the store (default: False)
        write=True,  # index completed logs (default: True)
    ),
    tasks=[...]
)

CLI flags override FlowStoreConfig values. Here’s how the flags combine:

Command Store read Store write
flow run spec.py no yes
flow run spec.py --store-read yes yes
flow run spec.py --no-store-write no no
flow run spec.py --store-read --no-store-write yes no
flow run spec.py --store none no no

--store none disables the store entirely — no store instance is created, and neither reading nor writing occurs.
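The table above can be expressed as a small resolution function. This is a hypothetical helper, not Flow's actual implementation:

```python
# Hypothetical helper mirroring the table above: CLI flags override the
# FlowStoreConfig values, and --store none disables the store entirely.
def resolve_store_flags(spec_read=False, spec_write=True,
                        cli_read=None, cli_write=None, store_none=False):
    if store_none:
        return (False, False)
    read = cli_read if cli_read is not None else spec_read    # default: False
    write = cli_write if cli_write is not None else spec_write  # default: True
    return (read, write)

print(resolve_store_flags())                # (False, True)
print(resolve_store_flags(cli_read=True))   # (True, True)
```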

Setting store location

By default, Flow uses the platform-specific store location shown in Backend to index and match logs. You can configure which store to use for the flow run command with the following precedence:

  1. CLI flag or environment variable (highest priority):
# CLI flag
flow run config.py --store /path/to/my-project-store

# Or environment variable
export INSPECT_FLOW_STORE=/path/to/my-project-store
flow run config.py
  2. FlowSpec parameter:
FlowSpec(
    store="./my-project-store",
    tasks=[...]
)
  3. Default location (lowest priority): Platform-specific default from Backend

This allows you to maintain multiple stores for different projects or purposes, and choose which one to use for each run. The CLI flag and environment variable are useful for temporarily overriding the store location without modifying your config file.
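The precedence order can be sketched as a lookup chain. This is an illustration of the rules above, not Flow's actual code; it assumes the CLI flag wins over the environment variable within tier 1:

```python
import os

# Sketch of the precedence described above; not the actual Flow implementation.
def resolve_store_location(cli_store=None, spec_store=None,
                           default="~/.local/share/inspect_flow"):
    # 1. CLI flag / INSPECT_FLOW_STORE environment variable (highest priority)
    if cli_store is not None:
        return cli_store
    env_store = os.environ.get("INSPECT_FLOW_STORE")
    if env_store is not None:
        return env_store
    # 2. FlowSpec `store` parameter
    if spec_store is not None:
        return spec_store
    # 3. Platform-specific default (lowest priority)
    return default

print(resolve_store_location(spec_store="./my-project-store"))
```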

Disabling the store

Disable the Flow Store entirely by setting the store parameter to None:

FlowSpec(
    store=None,
    tasks=[...]
)

This prevents all Flow Store operations — no logs are indexed or matched.

To disable only matching or only indexing while keeping the other, use the read and write flags instead (see Read and write control).

Team collaboration

S3-based shared store:

Set up a shared store on S3 for team collaboration:

FlowSpec(
    store=FlowStoreConfig(
        path="s3://my-team-bucket/flow-store",
        read=True,
    ),
    tasks=[...]
)

With read=True, team members reuse each other’s logs automatically. Delta Lake supports concurrent access, so multiple team members can safely use the same store simultaneously. See Inspect AI’s S3 documentation for S3 authentication setup.

Tip

Recommended workflows

Master FlowSpec approach: Create a _flow.py file at your repository root with the team store location and read=True.
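A sketch of such a _flow.py, assuming the S3-based shared store shown earlier (the bucket path is illustrative):

```python
# _flow.py at the repository root (illustrative bucket path)
from inspect_flow import FlowSpec, FlowStoreConfig

FlowSpec(
    store=FlowStoreConfig(
        path="s3://my-team-bucket/flow-store",  # shared team store
        read=True,                              # enable cross-directory matching
    ),
)
```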

Flow automatically discovers and inherits from _flow.py files in parent directories, so all team members working in the repository will inherit the shared store configuration — including read=True for log matching. No explicit includes needed. See Automatic Discovery for details on how _flow.py inheritance works.

Local-then-import approach: Run evaluations locally first, then import quality-assured logs to the shared store:

# Run locally (with personal store)
flow run config.py

# After QA, import to team store
flow store import s3://my-team-bucket/logs/ \
    --copy-from ./logs/ \
    --store s3://my-team-bucket/flow-store

This workflow lets you validate results before sharing them with the team.

Filtering

You can filter which logs are eligible for reuse by providing one or more log filters. A log filter is a function that receives an EvalLog (loaded in header-only mode) and returns True to include the log or False to exclude it. When multiple filters are provided, all must pass for a log to be included.

Defining a filter

Use the @log_filter decorator to register a filter function by name:

from inspect_ai.log import EvalLog
from inspect_flow import log_filter

@log_filter
def production_ready(log: EvalLog) -> bool:
    return log.status == "success" and "reviewed" in (log.eval.tags or [])
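When several filters are supplied, all must pass. The combination rule can be sketched in plain Python; the Log stand-in below is hypothetical (real filters receive an inspect_ai EvalLog):

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for EvalLog, just to illustrate filter combination.
@dataclass
class Log:
    status: str
    tags: list = field(default_factory=list)

def only_success(log) -> bool:
    return log.status == "success"

def reviewed(log) -> bool:
    return "reviewed" in log.tags

def passes_filters(log, filters) -> bool:
    # A log is eligible only if every filter returns True.
    return all(f(log) for f in filters)

print(passes_filters(Log("success", ["reviewed"]), [only_success, reviewed]))  # True
print(passes_filters(Log("error", ["reviewed"]), [only_success, reviewed]))    # False
```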

Using a filter with flow run

In your FlowSpec — use FlowStoreConfig to attach a filter to the store:

from inspect_flow import FlowSpec, FlowStoreConfig

FlowSpec(
    store=FlowStoreConfig(
        path="auto",
        filter=production_ready,  # callable, registered name, or list
    ),
    tasks=[...]
)

From the CLI — use --store-filter with a registered filter name. Use it multiple times to require all filters to pass:

flow run config.py --store-filter production_ready
flow run config.py --store-filter production_ready --store-filter only_success

The CLI flag overrides any filter set in the spec’s FlowStoreConfig.

Using filters with store commands

The flow store list, flow store remove, and flow list log commands support --filter and --exclude options:

# Only show logs that pass the filter
flow store list --filter production_ready

# Multiple filters (all must pass)
flow store list --filter production_ready --filter only_success

# Only show logs that do NOT pass the filter
flow store list --exclude production_ready

--filter and --exclude are mutually exclusive.

CLI Commands

Flow provides commands for manually managing logs in the store. All commands support the --store flag to specify which store to operate on.

The flow store import Command

Add logs to the Flow Store from directories or files:

flow store import logs/

Import from multiple locations:

flow store import logs/ archived_logs/

Import specific files:

flow store import logs/eval_001.json logs/eval_002.json

Copy logs to S3 and import:

flow store import s3://my-bucket/logs/ --copy-from ./local-logs/

This copies all logs from ./local-logs/ to s3://my-bucket/logs/, then imports them into the Flow Store. Useful for centralizing logs across team members.

Control recursive search:

flow store import logs/ --no-recursive

By default, flow store import recursively searches subdirectories for log files. Use --no-recursive to only import logs directly in the specified directory.

Preview before importing:

flow store import logs/ --dry-run

Environment variables:

  • INSPECT_FLOW_STORE - Store location (same as --store)
  • INSPECT_FLOW_STORE_PATH - Log paths to import
  • INSPECT_FLOW_STORE_RECURSIVE - Enable recursive search (true/false)
  • INSPECT_FLOW_STORE_IMPORT_COPY_FROM - Copy source directory (same as --copy-from)
  • INSPECT_FLOW_STORE_IMPORT_DRY_RUN - Enable dry run mode (true/false)

The flow store remove Command

Remove logs from the Flow Store by path prefix:

flow store remove logs/2024-01-15/

Remove multiple prefixes:

flow store remove logs/experiment_1/ logs/experiment_2/

Remove missing logs (that no longer exist on disk):

flow store remove --missing

Control recursive search:

flow store remove logs/experiments/ --no-recursive

By default, flow store remove recursively searches subdirectories for matching logs. Use --no-recursive to only remove logs directly in the specified directory.

Preview before removing:

flow store remove logs/old/ --dry-run

Environment variables:

  • INSPECT_FLOW_STORE - Store location (same as --store)
  • INSPECT_FLOW_STORE_PREFIX - Log path prefixes to remove
  • INSPECT_FLOW_STORE_RECURSIVE - Enable recursive search (true/false)
  • INSPECT_FLOW_STORE_REMOVE_MISSING - Remove missing logs (true/false)
  • INSPECT_FLOW_STORE_REMOVE_DRY_RUN - Enable dry run mode (true/false)
  • INSPECT_FLOW_STORE_FILTER - Registered filter name; only remove logs that pass (--filter). Space-separate multiple names (all must pass)
  • INSPECT_FLOW_STORE_EXCLUDE - Registered filter name; only remove logs that do NOT pass (--exclude)

The flow store list Command

View all logs in the Flow Store:

flow store list

Tree format:

flow store list --format tree

Environment variables:

  • INSPECT_FLOW_STORE - Store location (same as --store)
  • INSPECT_FLOW_STORE_LIST_FORMAT - Output format (flat or tree)
  • INSPECT_FLOW_STORE_FILTER - Registered filter name; only show logs that pass (--filter). Space-separate multiple names (all must pass)
  • INSPECT_FLOW_STORE_EXCLUDE - Registered filter name; only show logs that do NOT pass (--exclude)
Tip

For richer log listing with filtering by task, model, status, and date range, see the flow list log command.

The flow store info Command

View Flow Store statistics:

flow store info

Shows the Flow Store location, number of logs, number of log directories, and Flow Store version.

Environment variables:

  • INSPECT_FLOW_STORE - Store location (same as --store)

The flow store delete Command

Delete the entire Flow Store:

flow store delete

Skip confirmation prompt:

flow store delete -y

Environment variables:

  • INSPECT_FLOW_STORE - Store location (same as --store)

Advanced

Matching logs

When log matching is enabled, the Flow Store matches logs using a task identifier computed by Inspect AI. This identifier is a hash of your task configuration including:

  • Task name, model, solver, scorer, and task arguments
  • Model configuration (temperature, top_p, max_tokens, seed, etc.)
  • Task limits (message_limit, token_limit, time_limit, working_limit)
  • Approval and sandbox settings

Note that limit, sample_shuffle, retry parameters, and parallelism settings are NOT included—these are runtime parameters that don’t change what you’re evaluating.
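The effect of excluding runtime parameters can be illustrated with a toy hash. This is illustrative only: Inspect AI computes the real identifier internally, and the key names here are invented:

```python
import hashlib
import json

# Illustrative only: the real task identifier is computed by Inspect AI.
# The point is that runtime parameters are excluded, so two runs that differ
# only in `limit` still produce the same identifier and can share logs.
RUNTIME_KEYS = {"limit", "sample_shuffle", "retry_attempts", "max_tasks"}

def task_identifier(config: dict) -> str:
    stable = {k: v for k, v in config.items() if k not in RUNTIME_KEYS}
    return hashlib.sha256(json.dumps(stable, sort_keys=True).encode()).hexdigest()

a = task_identifier({"name": "gpqa_diamond", "model": "openai/gpt-4o", "limit": 10})
b = task_identifier({"name": "gpqa_diamond", "model": "openai/gpt-4o", "limit": 50})
print(a == b)  # True: limit is a runtime parameter, so the identifiers match
```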

Task definition determinism: All task definition methods (FlowTask with name/factory, @task decorated functions, raw Task objects) work with the Flow Store as long as they’re deterministic. Tasks with dynamic inputs—like timestamps, random data, or external API calls—will generate different identifiers on each run, preventing log reuse.

Invalidated logs: The Flow Store automatically skips logs marked as invalidated when selecting which log to reuse. If multiple matching logs exist, only non-invalidated logs with at least one completed sample are considered. Invalidated logs are only used if no other valid logs are available.

Missing or corrupted logs: If a log file referenced in the Flow Store no longer exists or cannot be read, the Flow Store automatically skips it and continues searching for other matching logs. This graceful error handling ensures that deleted or moved log files don’t break your workflow.

Log filters: Beyond task identifier matching, you can further control which logs are eligible for reuse using log filters. For example, you can restrict reuse to only successful evaluations or logs with specific tags. See Filtering for details.

Resuming incomplete samples

Flow uses Inspect AI’s Eval Set logic to resume incomplete logs. When log matching is enabled and the Flow Store finds a matching log, Flow copies it to your current log directory where Inspect’s eval_set automatically resumes from it. Logs already present in your log_dir are always resumed regardless of store settings.

Sample matching: For resumption to work, Inspect must match samples from the old log to the current run using sample IDs. There are two approaches:

  1. Explicit IDs (recommended): Include an id field in your dataset. This allows both deterministic and non-deterministic shuffling to work with resumption.

  2. Sequential IDs (automatic): Rely on Inspect’s auto-assigned sequential IDs. This works as long as your dataset order is stable. Note that if Inspect detects dataset.shuffle() was called, it will log a warning and skip sample reuse, running the full evaluation instead.
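Matching by explicit IDs can be sketched as follows. The dataset and IDs are hypothetical; Inspect performs this matching internally:

```python
# Sketch of sample matching with explicit IDs (recommended): completed samples
# from the old log are matched to the current dataset by id, so even a
# shuffled dataset resumes correctly. IDs here are hypothetical.
dataset = [{"id": f"q{i}", "input": f"question {i}"} for i in range(1, 6)]
completed_ids = {"q1", "q2", "q3"}  # samples finished in the interrupted log

to_run = [sample for sample in dataset if sample["id"] not in completed_ids]
print([sample["id"] for sample in to_run])  # ['q4', 'q5']
```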