Projects

Overview

In some cases you’ll prefer to define your transcript source and filters, scanning model, and other configuration once for a project rather than each time you run scout scan. You can do this with a scout.yaml project file.

For example, if we have this project file in our working directory:

scout.yaml
transcripts: s3://weave-rollouts/
filter: 
  - task_set='cybench'
model: openai/gpt-5

Then we can run a scan with simply:

scout scan scanner.py 

Note that the filter field contains one or more SQL WHERE clauses that address fields in the transcript database.

You can also define the location of scanning results and other configuration in project files. For example:

scout.yaml
transcripts: s3://weave-rollouts/
filter: 
  - task_set='cybench'
scans: ./cybench-scans

max_processes: 4

model: openai/gpt-5
generate_config:
   temperature: 0.0
   reasoning_effort: minimal
   max_connections: 50

tags: [ctf, cybench]

Note that the filter will constrain any scan done within the project (i.e. filters applied to individual scans will be AND combined with this filter).

Note that scout.yaml project files are intended to be checked in to version control so do not contain secrets. See the section below on using environment files for details on handling secrets.

Scout View

When you run scout view from a project directory it uses the project settings to initialize the Transcripts and Results panes. You can also edit the project settings by clicking on the Project button at the top right:

Project Settings

Project files support all of the same options available to scan jobs. The table below describes the available configuration fields:

Field Type Description
name str Project name (defaults to directory name).
transcripts str Transcript source: local path, S3 URL, or list of sources.
filter str | list SQL WHERE clauses that filter based on fields in the transcript database. This will constrain any scan done within the project (i.e. filters applied to individual scans will be AND combined with this filter).
scans str Location for scan results (defaults to ./scans).
model str Model for scanning (e.g., openai/gpt-5).
model_base_url str Base URL for model API.
model_args dict | str Model creation args as a dictionary or path to JSON/YAML file.
generate_config dict Generation config (e.g., temperature)
model_roles dict Named model roles for use with get_model().
max_transcripts int Maximum concurrent transcripts to process (defaults to 25).
max_processes int Maximum concurrent processes for multiprocessing (defaults to 4).
limit int Limit the number of transcripts processed.
shuffle bool | int Shuffle transcript order. Pass an int to set a random seed.
tags list Tags to associate with scans (e.g., [ctf, cybench]).
metadata dict Arbitrary metadata to associate with scans.
log_level str Console log level (defaults to warning).
scanners list | dict Scanner specifications to include in all scans.
worklist list Transcript IDs to process for each scanner.
validation dict Validation sets to apply for specific scanners.

Local Config

In some cases you might want to provide local overrides to a shared project configuration file. You can do this by adding a scout.local.yaml file alongside your scout.yaml file. For example, here we override the main project file with a different model, max connections, and log level:

scout.local.yaml
model: openai/gpt-5-mini
generate_config:
   max_connections: 100
log_level: info

Be sure to add scout.local.yaml to your .gitignore so it isn’t checked in to version control.

Environment (.env)

While scout.yaml project files are intended to be checked into version control, you’ll often have secrets and credentials that should not be committed. Use a .env file for these values.

Common secrets to store in .env:

  • API keys: OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.
  • Access tokens: HF_TOKEN for Hugging Face datasets and models.
  • Cloud credentials: AWS credentials for S3 access

When you run scout scan or other Scout commands, the .env file in your working directory (or any parent directory) is automatically loaded. For example:

.env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...

Be sure to add .env to your .gitignore file to prevent accidentally committing secrets.

See the Inspect AI documentation on environment files for additional details on .env file handling

Scan Jobs

Projects share all configuration fields with scan jobs. When you run a scan, the project configuration is automatically merged with the scan job (whether defined in code via @scanjob or in a YAML/JSON config file).

The merge follows these rules:

  • For simple fields like scans, model, max_transcripts, etc., the project value is used only when the scan job doesn’t specify a value.

  • The input transcripts (transcripts, filter) are treated as an atomic unit. A scan job can fully override project transcripts but not e.g. add clauses to the filter.

  • The model configuration (model, model_base_url, model_args, generate_config) are also treated as an atomic unit. If the scan job specifies a model, all model-related configuration comes from the scan job. Otherwise, all model configuration comes from the project.

  • For collection fields like tags, metadata, scanners, worklist, and validation, values from both the project and scan job are combined. If there are key conflicts, the scan job value takes precedence.

For example, given this project:

scout.yaml
transcripts: s3://weave-rollouts/cybench
model: openai/gpt-5
tags: [production]

And this scan job:

scan.yaml
scanners:
  - file: scanner.py
tags: [safety-audit]

The effective configuration will use transcripts and model from the project, scanners from the scan job, and the merged tags [production, safety-audit].

Default Project

When you run scout scan or other Scout commands, the system automatically searches for a scout.yaml project file wihtin the current working directory.

If no project file is found, Scout uses the following defaults:

  • name: Project directory name
  • transcripts: ./transcripts (if that directory exists) or ./logs (if that directory exists)
  • scans: ./scans