Coding Agents

Coding agent targets evaluate assistants that run against a real workspace. Each target has the shape `id`, `provider`, `runtime`, and `config`. The target `id` is AgentV’s stable selection and artifact identity, `provider` names the adapter or control boundary, and `runtime` names where the target runs. Provider settings live under `config`.

For LLM-based grading, use `defaults.grader`, the CLI options `--grader` / `--grader-target`, or an evaluator-specific target override. Grader selection is separate from the coding-agent target, so target definitions do not carry a `grader` field.

targets:
  - id: codex-local
    provider: codex-app-server
    runtime: host
    config:
      command: ["codex", "app-server"]
      model: gpt-5-codex
      reasoning_effort: high

graders:
  - id: openai-grader
    provider: openai
    config:
      model: gpt-5-mini

defaults:
  target: codex-local
  grader: openai-grader

execution:
  max_concurrency: 3

Process-backed coding-agent providers use config.command as a non-empty argv array. The first element is the executable or shim, and the remaining elements are arguments.
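
The argv rule above can be sketched as a small check. This is a minimal illustration, not AgentV's validator: `split_command` is a hypothetical helper, and the requirement that every element be a string is an assumption drawn from the examples in this page.

```python
def split_command(command):
    """Validate a config.command value and split it into (executable, args).

    Hypothetical helper for illustration; not part of AgentV.
    """
    # A shell string such as "codex exec" is rejected: argv must be an array.
    if not isinstance(command, (list, tuple)) or len(command) == 0:
        raise ValueError("config.command must be a non-empty argv array")
    # Assumption: every argv element is a string, and the executable is not empty.
    if not all(isinstance(part, str) for part in command) or not command[0]:
        raise ValueError("config.command elements must be strings")
    executable, *args = command
    return executable, list(args)


executable, args = split_command(["codex", "exec", "--json"])
print(executable, args)
```

Keeping the command as an array avoids shell quoting questions: the executable or shim is always the first element, and each remaining element is passed through as one argument.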

Runtime placement

Runtime	Boundary	Best fit
`host`	Runs the installed CLI or child runner on the current machine.	Local research, subscription OAuth, and evaluating the same profile an engineer uses manually.
`profile`	Runs a host process with isolated home/config/env such as `HOME`, `CODEX_HOME`, or temp dirs.	Cleaner local evals without container cost.
`sandbox`	Runs through a separate substrate such as Docker or a managed sandbox.	CI, reproducibility, untrusted tasks, and stronger filesystem containment.

Sandbox mode does not inherit host credentials. For CI, prefer API keys or explicit secrets injected through sandbox configuration. Subscription OAuth can be evaluated only by intentionally mounting or seeding the profile directory the agent needs, which trades portability for fidelity to the local agent setup.
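
One way to picture the three boundaries is by the environment a target process would see. The sketch below is an assumption-laden illustration, not AgentV's implementation: `build_env` is a hypothetical function, and the `profile` branch assumes that only the overridden variables (such as `HOME` or `CODEX_HOME`) are isolated while the rest of the host environment is kept.

```python
def build_env(mode, host_env, overrides=None, secrets=None):
    """Return the environment a target process would see under each runtime mode.

    Assumed semantics, for illustration only:
      host    -> the host environment, unchanged
      profile -> the host environment with isolation overrides applied
      sandbox -> only explicitly injected secrets; nothing inherited from the host
    """
    if mode == "host":
        return dict(host_env)
    if mode == "profile":
        return {**host_env, **(overrides or {})}
    if mode == "sandbox":
        return dict(secrets or {})
    raise ValueError(f"unknown runtime mode: {mode!r}")


# Placeholder values; a real run would read the host environment instead.
host_env = {"HOME": "/home/dev", "OPENAI_API_KEY": "host-key"}
profile_env = build_env(
    "profile", host_env, overrides={"CODEX_HOME": ".agentv/profiles/codex-clean"}
)
sandbox_env = build_env("sandbox", host_env, secrets={"OPENAI_API_KEY": "ci-key"})
print(sorted(profile_env), sorted(sandbox_env))
```

The sandbox branch is the point of the example: the host key never reaches the sandboxed process unless it is injected on purpose.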

targets:
  - id: codex-clean-profile
    provider: codex-cli
    runtime:
      mode: profile
      codex_home: .agentv/profiles/codex-clean
      tmp_dir: .agentv/tmp/codex-clean
    config:
      command: ["codex", "exec", "--json"]
      model: gpt-5-codex
      sandbox_mode: workspace-write
      approval_policy: never

targets:
  - id: codex-ci-sandbox
    provider: codex-cli
    runtime:
      mode: sandbox
      engine: docker
      image: ghcr.io/acme/codex-agent:sha256
      workdir: /workspace
      mounts:
        - source: ./workspace
          target: /workspace
          access: rw
        - source: ./.agentv/results
          target: /results
          access: rw
      secrets:
        OPENAI_API_KEY: ${{ OPENAI_API_KEY }}
    config:
      command: ["codex", "exec", "--json"]
      model: gpt-5-codex
      timeout_seconds: 300
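
The examples in this section write `runtime` in two forms: a bare mode string (`host`) or a mapping with a `mode` key plus mode-specific settings. A reader of such configs might normalize both forms as sketched below. `runtime_mode` is a hypothetical helper, and rejecting modes outside the three in the table is an assumption.

```python
KNOWN_MODES = {"host", "profile", "sandbox"}


def runtime_mode(runtime):
    """Return the mode for a runtime written as "host" or {"mode": "profile", ...}."""
    if isinstance(runtime, str):
        mode = runtime
    elif isinstance(runtime, dict) and isinstance(runtime.get("mode"), str):
        mode = runtime["mode"]
    else:
        raise ValueError("runtime must be a mode string or a mapping with a 'mode' key")
    # Assumption: only the modes listed in the runtime table are accepted.
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown runtime mode: {mode!r}")
    return mode


print(runtime_mode("host"))
print(runtime_mode({"mode": "sandbox", "engine": "docker", "workdir": "/workspace"}))
```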

Provider tradeoffs

Provider	Boundary	Transcript and isolation notes
`codex-app-server`	Codex app-server subprocess.	Preferred Codex path for rich protocol events, session control, cancellation, and structured transcripts.
`codex-cli`	Codex CLI subprocess.	Best for simple local/CI process isolation and evaluating an installed Codex shim or profile.
`codex-sdk`	Codex SDK in an AgentV child runner.	Explicit SDK path. SDK crashes, malformed child output, and missing optional SDK packages are target execution errors.
`pi-rpc`	Pi launched in RPC mode over stdio.	Preferred rich Pi control boundary; AgentV launches the configured command with RPC mode when needed.
`pi-cli`	Pi CLI subprocess.	Simple process boundary; transcript richness depends on Pi CLI output.
`pi-sdk`	Pi SDK in an AgentV child runner.	Explicit SDK path for SDK-native events with child-process isolation.
`claude-cli`	Claude CLI subprocess.	Default Claude path; captures structured stream output when available.
`claude-sdk`	Claude Agent SDK in an AgentV child runner.	Explicit SDK path; useful when SDK-native events matter more than matching a local CLI invocation.
`copilot-cli`	Copilot CLI subprocess/protocol path.	Active Copilot eval run through the installed process.
`copilot-log`	Passive Copilot session-log reader.	Zero-cost transcript grading for existing sessions; it does not run a new agent.
`copilot-sdk`	Copilot SDK in an AgentV child runner.	Explicit SDK path with child-process isolation.

Every coding-agent provider returns a structured target execution envelope. Run bundles preserve target id, provider kind, runtime mode, command argv, cwd, stdout/stderr, transcripts or logs, final output when available, timing, timeouts, exit codes, signals, and partial artifacts on failure.
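
To make the envelope concrete, here is a rough sketch of a runner that records some of those fields for a process-backed target. The `ExecutionEnvelope` class, its field names, and `run_target` are hypothetical and do not reflect AgentV's actual schema; the sketch only shows how argv, output, exit code, timing, and a timeout with partial output could be captured together.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExecutionEnvelope:
    """Hypothetical envelope; field names are illustrative, not AgentV's schema."""
    target_id: str
    provider: str
    runtime_mode: str
    argv: List[str]
    cwd: Optional[str]
    stdout: str
    stderr: str
    exit_code: Optional[int]  # on POSIX, a negative value means killed by that signal
    timed_out: bool
    duration_seconds: float


def _as_text(data):
    """Normalize captured output, which may be None, bytes, or str."""
    if data is None:
        return ""
    return data.decode(errors="replace") if isinstance(data, bytes) else data


def run_target(target_id, provider, runtime_mode, argv, cwd=None, timeout_seconds=30.0):
    """Run argv and record the outcome, keeping partial output on timeout."""
    started = time.monotonic()
    try:
        proc = subprocess.run(
            argv, cwd=cwd, capture_output=True, text=True, timeout=timeout_seconds
        )
        stdout, stderr = proc.stdout, proc.stderr
        exit_code, timed_out = proc.returncode, False
    except subprocess.TimeoutExpired as exc:
        # Whatever was captured before the timeout is preserved as a partial artifact.
        stdout, stderr = _as_text(exc.stdout), _as_text(exc.stderr)
        exit_code, timed_out = None, True
    return ExecutionEnvelope(
        target_id, provider, runtime_mode, list(argv), cwd,
        stdout, stderr, exit_code, timed_out, time.monotonic() - started,
    )


# A stand-in child process; a real target would use its configured command.
envelope = run_target("mock-target", "mock", "host", [sys.executable, "-c", "print('ok')"])
print(envelope.exit_code, envelope.stdout.strip(), envelope.timed_out)
```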

Codex

Use codex-app-server when you want rich protocol control:

targets:
  - id: codex-local
    provider: codex-app-server
    runtime: host
    config:
      command: ["codex", "app-server"]
      model: gpt-5-codex
      reasoning_effort: high
      model_verbosity: medium

Use codex-cli when a simple CLI boundary is enough or when you want an operator-specific shim:

targets:
  - id: codex-eng
    provider: codex-cli
    runtime: host
    config:
      command: ["codex-eng", "exec", "--json"]
      model: ${{ CODEX_MODEL }}

Use codex-sdk only when you intentionally want the Codex SDK path:

targets:
  - id: codex-sdk-isolated
    provider: codex-sdk
    runtime: host
    config:
      model: gpt-5-codex

Common Codex config fields include command, model, reasoning_effort, model_verbosity, base_url, api_key, api_format, sandbox_mode, approval_policy, cwd, timeout_seconds, log_dir, stream_log, and system_prompt.

Pi

Use pi-rpc for the rich stdio/RPC boundary:

targets:
  - id: pi-rpc-local
    provider: pi-rpc
    runtime: host
    config:
      command: ["pi"]
      model: gpt-5-codex
      thinking: medium

Use pi-cli for simple subprocess execution:

targets:
  - id: pi-cli-local
    provider: pi-cli
    runtime:
      mode: profile
      home: .agentv/profiles/pi-local
    config:
      command: ["pi"]
      subprovider: openrouter
      model: ${{ OPENROUTER_MODEL }}
      api_key: ${{ OPENROUTER_API_KEY }}

Use pi-sdk only when you intentionally want the SDK path:

targets:
  - id: pi-sdk-isolated
    provider: pi-sdk
    runtime: host
    config:
      subprovider: openai-codex
      model: gpt-5.5
      thinking: medium

Pi config fields include command, subprovider, model, thinking, tools, api_key, base_url, cwd, timeout_seconds, log_dir, stream_log, and system_prompt. With pi-cli, the built-in OpenAI provider does not expose a CLI base-url option; use a Pi custom provider name or Pi’s Azure provider path for custom gateways.

Claude

targets:
  - id: claude-local
    provider: claude-cli
    runtime: host
    config:
      command: ["claude"]
      model: claude-sonnet-4-20250514
      max_turns: 10

Use claude-cli when you want AgentV to spawn the same Claude CLI a user runs locally. Use claude-sdk only when you intentionally want the Claude Agent SDK path:

targets:
  - id: claude-sdk-isolated
    provider: claude-sdk
    runtime: host
    config:
      model: claude-sonnet-4-20250514
      max_turns: 10

Claude config fields include command, model, cwd, timeout_seconds, max_turns, max_budget_usd, bypass_permissions, log_dir, stream_log, and system_prompt.

Copilot

targets:
  - id: copilot-local
    provider: copilot-cli
    runtime: host
    config:
      command: ["copilot"]
      model: gpt-5-mini

Route Copilot through an OpenAI-compatible endpoint:

targets:
  - id: copilot-openai
    provider: copilot-cli
    runtime: host
    config:
      command: ["copilot"]
      subprovider: openai
      base_url: ${{ OPENAI_ENDPOINT }}
      api_key: ${{ OPENAI_API_KEY }}
      api_format: responses

Read an existing Copilot session log without running a new agent:

targets:
  - id: copilot-session-log
    provider: copilot-log
    runtime: host
    config:
      discover: latest

Use copilot-sdk only when you intentionally want the SDK path:

targets:
  - id: copilot-sdk-isolated
    provider: copilot-sdk
    runtime: host
    config:
      model: gpt-5-mini

Copilot config fields include command, model, cwd, timeout_seconds, subprovider, base_url, api_key, bearer_token, api_version, api_format, log_dir, stream_log, system_prompt, and session-log fields such as discover, session_id, and session_dir for copilot-log.

File inputs

Agent providers receive file inputs as paths, not inline file content. The prompt includes a preread block with file:// URIs pointing to absolute paths on disk, then the user query references each file:

input:
  - role: user
    content:
      - type: file
        value: ./src/example.ts
      - type: text
        value: Review this code

The agent receives a prompt like:

Read all input files:
* [example.ts](file:///abs/path/src/example.ts).

If any file is missing, fail with ERROR: missing-file <filename> and stop.
Then apply system_instructions on the user query below.

[[ ## user_query ## ]]
<file: path="./src/example.ts">
Review this code

LLM providers receive file content inline instead; see LLM providers.
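
The path-based prompt for agent providers can be approximated as follows. This is a sketch that mirrors the prompt shape shown above; `build_agent_prompt` and its `base_dir` parameter are hypothetical, and the exact template AgentV renders may differ.

```python
from pathlib import Path


def build_agent_prompt(file_paths, query, base_dir="."):
    """Render file inputs as file:// links, followed by the user query.

    Hypothetical helper; relative paths are resolved against base_dir (an assumption).
    """
    lines = ["Read all input files:"]
    for raw in file_paths:
        absolute = (Path(base_dir) / raw).resolve()
        # as_uri() needs an absolute path, which resolve() provides.
        lines.append(f"* [{absolute.name}]({absolute.as_uri()}).")
    lines += [
        "",
        "If any file is missing, fail with ERROR: missing-file <filename> and stop.",
        "Then apply system_instructions on the user query below.",
        "",
        "[[ ## user_query ## ]]",
    ]
    # The query section references each file by its original path, not its content.
    lines += [f'<file: path="{raw}">' for raw in file_paths]
    lines.append(query)
    return "\n".join(lines)


prompt = build_agent_prompt(["./src/example.ts"], "Review this code")
print(prompt)
```

Only paths travel in the prompt, so the agent is expected to read the files from the workspace itself.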

Mock provider

For deterministic harness checks without a real provider:

targets:
  - id: mock-target
    provider: mock
    runtime: host
    config:
      response: ok