Coding Agents
Coding agent targets evaluate assistants that run against a real workspace. They
use the target shape `id`, `provider`, `runtime`, and `config`. The target `id`
is AgentV’s stable selection and artifact identity. `provider` names the
adapter or control boundary. `runtime` names where the target runs. Provider
settings live under `config`.

Use `defaults.grader`, the CLI flags `--grader` / `--grader-target`, or an
evaluator-specific target override for LLM-based grading. Grader selection is
separate from the coding-agent target, so target definitions do not carry a
`grader` field.

```yaml
targets:
  - id: codex-local
    provider: codex-app-server
    runtime: host
    config:
      command: ["codex", "app-server"]
      model: gpt-5-codex
      reasoning_effort: high

graders:
  - id: openai-grader
    provider: openai
    config:
      model: gpt-5-mini

defaults:
  target: codex-local
  grader: openai-grader

execution:
  max_concurrency: 3
```

Process-backed coding-agent providers use `config.command` as a non-empty argv
array. The first element is the executable or shim, and the remaining elements
are arguments.
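
The argv rule above can be illustrated with a small validation sketch. This is not AgentV's implementation; `split_command` is a hypothetical helper, and rejecting non-string elements is an assumption beyond what the text states.

```python
from collections.abc import Sequence


def split_command(command):
    """Validate an argv-style command and split it into (executable, args).

    Hypothetical helper for illustration only; not part of AgentV.
    """
    # A bare string is rejected so "codex exec" is never split on spaces by accident.
    if isinstance(command, str) or not isinstance(command, Sequence):
        raise TypeError("command must be an array of strings")
    if len(command) == 0:
        raise ValueError("command must be a non-empty argv array")
    # Assumption: every element is expected to be a string.
    if not all(isinstance(part, str) for part in command):
        raise TypeError("every command element must be a string")
    # The first element is the executable or shim; the rest are arguments.
    executable, *args = command
    return executable, list(args)


executable, args = split_command(["codex", "exec", "--json"])
print(executable, args)
```

Keeping the command as an array avoids shell quoting questions: each element reaches the process as one argument.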
Runtime placement
| Runtime | Boundary | Best fit |
|---|---|---|
| `host` | Runs the installed CLI or child runner on the current machine. | Local research, subscription OAuth, and evaluating the same profile an engineer uses manually. |
| `profile` | Runs a host process with isolated home/config/env such as `HOME`, `CODEX_HOME`, or temp dirs. | Cleaner local evals without container cost. |
| `sandbox` | Runs through a separate substrate such as Docker or a managed sandbox. | CI, reproducibility, untrusted tasks, and stronger filesystem containment. |
Sandbox mode does not inherit host credentials. For CI, prefer API keys or explicit secrets injected through sandbox configuration. Subscription OAuth can be evaluated only by intentionally mounting or seeding the profile directory the agent needs, which trades portability for fidelity to the local agent setup.
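
The credential boundary between the three runtime modes can be sketched as a function that builds the child environment. This is a simplified model, not AgentV's code: `build_child_env` and its parameters are hypothetical, and it assumes that `profile` inherits the host environment apart from the overridden variables, which the text does not state.

```python
from typing import Mapping, Optional


def build_child_env(
    mode: str,
    host_env: Mapping[str, str],
    overrides: Optional[Mapping[str, str]] = None,
    secrets: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return the environment a target would see under each runtime mode (illustrative)."""
    if mode == "host":
        # Host mode sees the engineer's own environment unchanged.
        return dict(host_env)
    if mode == "profile":
        # Assumption: profile mode keeps the host env but redirects HOME-style variables.
        return {**host_env, **(overrides or {})}
    if mode == "sandbox":
        # Sandbox mode starts empty: only explicitly injected secrets are visible.
        return dict(secrets or {})
    raise ValueError(f"unknown runtime mode: {mode}")


host_env = {"HOME": "/home/dev", "OPENAI_API_KEY": "host-key"}
profile_env = build_child_env(
    "profile", host_env, overrides={"CODEX_HOME": ".agentv/profiles/codex-clean"}
)
sandbox_env = build_child_env("sandbox", host_env, secrets={"OPENAI_API_KEY": "ci-key"})
print(sorted(profile_env), sorted(sandbox_env))
```

In this model the sandbox never sees `host-key`, which is why CI runs need the key injected explicitly.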
A `profile` target that isolates the Codex home and temp directories:

```yaml
targets:
  - id: codex-clean-profile
    provider: codex-cli
    runtime:
      mode: profile
      codex_home: .agentv/profiles/codex-clean
      tmp_dir: .agentv/tmp/codex-clean
    config:
      command: ["codex", "exec", "--json"]
      model: gpt-5-codex
      sandbox_mode: workspace-write
      approval_policy: never
```

A `sandbox` target that runs in Docker with explicit mounts and secrets:

```yaml
targets:
  - id: codex-ci-sandbox
    provider: codex-cli
    runtime:
      mode: sandbox
      engine: docker
      image: ghcr.io/acme/codex-agent:sha256
      workdir: /workspace
      mounts:
        - source: ./workspace
          target: /workspace
          access: rw
        - source: ./.agentv/results
          target: /results
          access: rw
      secrets:
        OPENAI_API_KEY: ${{ OPENAI_API_KEY }}
    config:
      command: ["codex", "exec", "--json"]
      model: gpt-5-codex
      timeout_seconds: 300
```

Provider tradeoffs
| Provider | Boundary | Transcript and isolation notes |
|---|---|---|
| `codex-app-server` | Codex app-server subprocess. | Preferred Codex path for rich protocol events, session control, cancellation, and structured transcripts. |
| `codex-cli` | Codex CLI subprocess. | Best for simple local/CI process isolation and evaluating an installed Codex shim or profile. |
| `codex-sdk` | Codex SDK in an AgentV child runner. | Explicit SDK path. SDK crashes, malformed child output, and missing optional SDK packages are target execution errors. |
| `pi-rpc` | Pi launched in RPC mode over stdio. | Preferred rich Pi control boundary; AgentV launches the configured command with RPC mode when needed. |
| `pi-cli` | Pi CLI subprocess. | Simple process boundary; transcript richness depends on Pi CLI output. |
| `pi-sdk` | Pi SDK in an AgentV child runner. | Explicit SDK path for SDK-native events with child-process isolation. |
| `claude-cli` | Claude CLI subprocess. | Default Claude path; captures structured stream output when available. |
| `claude-sdk` | Claude Agent SDK in an AgentV child runner. | Explicit SDK path; useful when SDK-native events matter more than matching a local CLI invocation. |
| `copilot-cli` | Copilot CLI subprocess/protocol path. | Active Copilot eval run through the installed process. |
| `copilot-log` | Passive Copilot session-log reader. | Zero-cost transcript grading for existing sessions; it does not run a new agent. |
| `copilot-sdk` | Copilot SDK in an AgentV child runner. | Explicit SDK path with child-process isolation. |

Every coding-agent provider returns a structured target execution envelope. Run bundles preserve target id, provider kind, runtime mode, command argv, cwd, stdout/stderr, transcripts or logs, final output when available, timing, timeouts, exit codes, signals, and partial artifacts on failure.
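
To make the envelope contents concrete, the fields listed above can be modeled as a record. This is a sketch only: the class name, field names, types, and the `succeeded` rule are all assumptions for illustration, not AgentV's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TargetExecutionEnvelope:
    """Hypothetical shape mirroring the fields a run bundle preserves."""

    target_id: str
    provider: str
    runtime_mode: str
    command: List[str]
    cwd: str
    stdout: str = ""
    stderr: str = ""
    transcript_paths: List[str] = field(default_factory=list)
    final_output: Optional[str] = None  # only present when the agent produced one
    duration_ms: Optional[int] = None
    timed_out: bool = False
    exit_code: Optional[int] = None
    signal: Optional[str] = None
    partial_artifacts: List[str] = field(default_factory=list)  # kept even on failure

    @property
    def succeeded(self) -> bool:
        # Assumption: success means a clean exit with no timeout and no signal.
        return self.exit_code == 0 and not self.timed_out and self.signal is None


timed_out_run = TargetExecutionEnvelope(
    target_id="codex-ci-sandbox",
    provider="codex-cli",
    runtime_mode="sandbox",
    command=["codex", "exec", "--json"],
    cwd="/workspace",
    timed_out=True,
    signal="SIGTERM",
    partial_artifacts=["/results/partial-transcript.jsonl"],
)
print(timed_out_run.succeeded)
```

A failed run still carries its argv, working directory, and partial artifacts, which is what makes it possible to diagnose a failure from the bundle alone.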
Use `codex-app-server` when you want rich protocol control:

```yaml
targets:
  - id: codex-local
    provider: codex-app-server
    runtime: host
    config:
      command: ["codex", "app-server"]
      model: gpt-5-codex
      reasoning_effort: high
      model_verbosity: medium
```

Use `codex-cli` when a simple CLI boundary is enough or when you want an
operator-specific shim:

```yaml
targets:
  - id: codex-eng
    provider: codex-cli
    runtime: host
    config:
      command: ["codex-eng", "exec", "--json"]
      model: ${{ CODEX_MODEL }}
```

Use `codex-sdk` only when you intentionally want the Codex SDK path:

```yaml
targets:
  - id: codex-sdk-isolated
    provider: codex-sdk
    runtime: host
    config:
      model: gpt-5-codex
```

Common Codex config fields include `command`, `model`, `reasoning_effort`,
`model_verbosity`, `base_url`, `api_key`, `api_format`, `sandbox_mode`,
`approval_policy`, `cwd`, `timeout_seconds`, `log_dir`, `stream_log`, and
`system_prompt`.
Use `pi-rpc` for the rich stdio/RPC boundary:

```yaml
targets:
  - id: pi-rpc-local
    provider: pi-rpc
    runtime: host
    config:
      command: ["pi"]
      model: gpt-5-codex
      thinking: medium
```

Use `pi-cli` for simple subprocess execution:

```yaml
targets:
  - id: pi-cli-local
    provider: pi-cli
    runtime:
      mode: profile
      home: .agentv/profiles/pi-local
    config:
      command: ["pi"]
      subprovider: openrouter
      model: ${{ OPENROUTER_MODEL }}
      api_key: ${{ OPENROUTER_API_KEY }}
```

Use `pi-sdk` only when you intentionally want the SDK path:

```yaml
targets:
  - id: pi-sdk-isolated
    provider: pi-sdk
    runtime: host
    config:
      subprovider: openai-codex
      model: gpt-5.5
      thinking: medium
```

Pi config fields include `command`, `subprovider`, `model`, `thinking`,
`tools`, `api_key`, `base_url`, `cwd`, `timeout_seconds`, `log_dir`,
`stream_log`, and `system_prompt`. With `pi-cli`, the built-in OpenAI provider
does not expose a CLI base-url option; use a Pi custom provider name or Pi’s
Azure provider path for custom gateways.
Claude

```yaml
targets:
  - id: claude-local
    provider: claude-cli
    runtime: host
    config:
      command: ["claude"]
      model: claude-sonnet-4-20250514
      max_turns: 10
```

Use `claude-cli` when you want AgentV to spawn the same Claude CLI a user runs
locally. Use `claude-sdk` only when you intentionally want the Claude Agent SDK
path:

```yaml
targets:
  - id: claude-sdk-isolated
    provider: claude-sdk
    runtime: host
    config:
      model: claude-sonnet-4-20250514
      max_turns: 10
```

Claude config fields include `command`, `model`, `cwd`, `timeout_seconds`,
`max_turns`, `max_budget_usd`, `bypass_permissions`, `log_dir`, `stream_log`,
and `system_prompt`.
Copilot

```yaml
targets:
  - id: copilot-local
    provider: copilot-cli
    runtime: host
    config:
      command: ["copilot"]
      model: gpt-5-mini
```

Route Copilot through an OpenAI-compatible endpoint:

```yaml
targets:
  - id: copilot-openai
    provider: copilot-cli
    runtime: host
    config:
      command: ["copilot"]
      subprovider: openai
      base_url: ${{ OPENAI_ENDPOINT }}
      api_key: ${{ OPENAI_API_KEY }}
      api_format: responses
```

Read an existing Copilot session log without running a new agent:

```yaml
targets:
  - id: copilot-session-log
    provider: copilot-log
    runtime: host
    config:
      discover: latest
```

Use `copilot-sdk` only when you intentionally want the SDK path:

```yaml
targets:
  - id: copilot-sdk-isolated
    provider: copilot-sdk
    runtime: host
    config:
      model: gpt-5-mini
```

Copilot config fields include `command`, `model`, `cwd`, `timeout_seconds`,
`subprovider`, `base_url`, `api_key`, `bearer_token`, `api_version`,
`api_format`, `log_dir`, `stream_log`, `system_prompt`, and session-log fields
such as `discover`, `session_id`, and `session_dir` for `copilot-log`.
File inputs
Agent providers receive file inputs as paths, not inline file content. The
prompt includes a preread block with `file://` URIs pointing to absolute paths
on disk, then the user query references each file:

```yaml
input:
  - role: user
    content:
      - type: file
        value: ./src/example.ts
      - type: text
        value: Review this code
```

The agent receives a prompt like:

```
Read all input files:
* [example.ts](file:///abs/path/src/example.ts).

If any file is missing, fail with ERROR: missing-file <filename> and stop.
Then apply system_instructions on the user query below.

[[ ## user_query ## ]]
<file: path="./src/example.ts">
Review this code
```

LLM providers receive file content inline instead; see LLM providers.
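
The path-based prompt shape can be approximated with a short sketch. `build_agent_prompt` is a hypothetical helper, not AgentV's prompt builder; it only mirrors the layout of the example prompt and assumes relative paths are resolved against a base directory.

```python
from pathlib import Path
from typing import List


def build_agent_prompt(file_paths: List[str], user_text: str, base_dir: str = ".") -> str:
    """Build a preread block with file:// URIs, then the user query (illustrative)."""
    lines = ["Read all input files:"]
    for rel in file_paths:
        # Resolve to an absolute path so the URI is valid regardless of the agent's cwd.
        absolute = (Path(base_dir) / rel).resolve()
        lines.append(f"* [{absolute.name}]({absolute.as_uri()}).")
    lines += [
        "",
        "If any file is missing, fail with ERROR: missing-file <filename> and stop.",
        "Then apply system_instructions on the user query below.",
        "",
        "[[ ## user_query ## ]]",
    ]
    # The user query references each file by its original path, not its content.
    for rel in file_paths:
        lines.append(f'<file: path="{rel}">')
    lines.append(user_text)
    return "\n".join(lines)


prompt = build_agent_prompt(["./src/example.ts"], "Review this code")
print(prompt)
```

Because only paths are passed, the agent reads the files itself with its own tools, so large inputs do not inflate the prompt.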
Mock provider
For deterministic harness checks without a real provider:

```yaml
targets:
  - id: mock-target
    provider: mock
    runtime: host
    config:
      response: ok
```