Inspect SWE

Software Engineering Agents for Inspect AI

Overview

The inspect_swe package makes software engineering agents like Claude Code and Codex CLI available as standard Inspect agents. For example, here we use the claude_code() agent as the solver in an Inspect task:

from inspect_ai import Task, task
from inspect_ai.dataset import json_dataset
from inspect_ai.scorer import model_graded_qa

from inspect_swe import claude_code

@task
def system_explorer() -> Task:
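    return Task(
        dataset=json_dataset("dataset.json"),  # samples loaded from a local JSON file
        solver=claude_code(),                  # Claude Code acts as the solver
        scorer=model_graded_qa(),              # a model grades each answer
        sandbox="docker",                      # the agent runs inside this sandbox
    )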

Inspect SWE agents are implemented using the Inspect sandbox_agent_bridge().

Agents run inside the sample sandbox, and their model API calls are proxied back to Inspect. As a result, you can use any model with Inspect SWE agents, and features such as token limits, time limits, and log transcripts work the same way they do for other agents.
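Because model calls are proxied through Inspect, the model and the per-sample limits can be chosen when the evaluation is launched rather than inside the agent. The following is a minimal sketch, not a definitive recipe: the model name, the limit values, the `task.py` filename, and the `run_system_explorer()` helper are all assumptions made for illustration.

```python
# Hypothetical settings for illustration; adjust the model name and limits to your setup.
EVAL_OPTIONS = {
    "model": "openai/gpt-4o",  # assumed model name; the agent's API calls are proxied to it
    "token_limit": 500_000,    # cap on tokens used per sample
    "time_limit": 30 * 60,     # cap on clock time per sample, in seconds
}


def run_system_explorer():
    """Run the system_explorer task with the options above.

    Needs inspect-ai, inspect-swe, Docker, and credentials for the chosen model.
    """
    from inspect_ai import eval

    # Assumes the task shown in the Overview is saved as task.py (hypothetical filename).
    from task import system_explorer

    return eval(system_explorer(), **EVAL_OPTIONS)
```

Calling `run_system_explorer()` starts the evaluation. The transcript of the agent's model calls should then appear in the Inspect log like that of any other agent.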

Getting Started

Install Inspect SWE from PyPI with:

pip install inspect-swe

Then, try out one or more of the available agents:

Agent          Description
claude_code()  Anthropic’s agentic coding tool Claude Code
codex_cli()    OpenAI’s terminal-based coding agent Codex CLI