
Agent Lifecycle

Overview

Symphony isolates every agent run in a git worktree: a separate working directory with its own checked-out branch that shares the parent repository's object store. When the orchestrator dispatches an issue, it creates a worktree, builds a layered system prompt with workspace boundary constraints, spawns a Claude CLI subprocess, and monitors it until completion. When the agent exits, the orchestrator captures its output, auto-commits any uncommitted changes, updates the issue status, and schedules either a retry or cleanup.

This isolation model lets multiple agents work concurrently without filesystem collisions, keeps the parent repository untouched, and makes cleanup a single git worktree remove.


Dispatch Flow

[Diagram: the end-to-end sequence from orchestrator tick to agent start]


Worktree Setup

Creation

workspaceManager.ts creates a git worktree in .symphony-workspaces/<sanitized-identifier>/. If a worktree already exists from a previous run, it is reused and the latest main branch is merged into it. Stale worktree references left by prior crashes are pruned before creation to avoid "already registered" errors.
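The creation path can be sketched as a pure planning step. This is a minimal sketch and not the code in workspaceManager.ts: the sanitization rule, the helper names, and the exact git arguments are assumptions, and the symphony/<identifier> branch name is inferred from the user prompt header format described in this document.

```typescript
import * as path from "node:path";

// Hypothetical sanitizer: the real rule used by workspaceManager.ts is not shown here.
function sanitizeIdentifier(identifier: string): string {
  return identifier
    .trim()
    .replace(/[^A-Za-z0-9._-]+/g, "-") // collapse unsafe runs into one dash
    .replace(/^-+|-+$/g, "");          // strip leading and trailing dashes
}

interface WorktreePlan {
  dir: string;
  branch: string;
  commands: string[][]; // argv arrays that would be passed to `git`
}

// Decides which git commands to run; executing them is left to the caller.
function planWorktree(repoRoot: string, identifier: string, exists: boolean): WorktreePlan {
  const safe = sanitizeIdentifier(identifier);
  const dir = path.join(repoRoot, ".symphony-workspaces", safe);
  const branch = `symphony/${safe}`; // assumed naming scheme
  const commands = exists
    ? [["-C", dir, "merge", "main"]] // reuse: merge latest main into the worktree
    : [
        ["worktree", "prune"], // drop stale registrations first
        ["worktree", "add", "-b", branch, dir],
      ];
  return { dir, branch, commands };
}
```

Separating the plan from its execution keeps the reuse-versus-create decision easy to unit test without touching a real repository.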

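The worktree holds a full checkout of every tracked file, so committed dependencies, configs, and test data are all present. Untracked files such as .env are copied separately (see Configuration Copy).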

Workspace Boundary Injection

The orchestrator never writes a CLAUDE.md file into the worktree. Instead, workspace boundary constraints are reinforced through four layers (defense-in-depth):

  1. System prompt: getWorkspaceBoundary() injects a ## CRITICAL: Workspace Boundary section with a FIRST ACTION: pwd verification instruction
  2. User prompt header: buildUserPrompt() renders a > **Working in**: /path | **Branch**: symphony/ID line immediately after the issue title (worker and judge dispatches)
  3. Agent profiles: every profile (worker.md, judge.md, researcher.md, etc.) includes a "Step 0: Verify Workspace" section as the first action
  4. Phase contracts: every phase contract (ready.md, research.md, architecture.md, grooming.md) includes a "Workspace Context" section with pwd and git branch --show-current verification commands

This multi-layer approach keeps the worktree path visible even during long sessions where the system prompt drifts out of the context window. Injecting the boundary through prompts rather than a file in the worktree also:

  • Keeps the boundary invisible to git (no diff pollution, no merge conflicts)
  • Makes the boundary immutable — the agent cannot edit the file that constrains it
  • Avoids ambiguity between repo-level and worktree-level instructions

The boundary tells the agent its working directory, prohibits cd outside it, and requires all paths to be relative to the worktree root.
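As an illustration of the first two layers, the boundary section and the prompt header could be rendered along these lines. This is a sketch under assumptions: buildBoundarySection and buildPromptHeader are hypothetical stand-ins for getWorkspaceBoundary() and buildUserPrompt(), and the exact wording of the real boundary text is not given in this document.

```typescript
// Hypothetical stand-in for getWorkspaceBoundary(); the wording is illustrative only.
function buildBoundarySection(worktreePath: string, branch: string): string {
  return [
    "## CRITICAL: Workspace Boundary",
    "",
    `FIRST ACTION: run pwd and confirm it prints ${worktreePath}`,
    "",
    `- Working directory: ${worktreePath} (branch ${branch})`,
    "- Do not cd outside this directory.",
    "- Use paths relative to the worktree root.",
  ].join("\n");
}

// Header line described for worker and judge user prompts.
function buildPromptHeader(worktreePath: string, branch: string): string {
  return `> **Working in**: ${worktreePath} | **Branch**: ${branch}`;
}
```

Because both strings are generated at dispatch time, nothing is written into the worktree and the agent has no file to edit.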

Configuration Copy

worktreeSetup.ts copies .env, .env.local, and symphony.config.json from the project root into the worktree on first dispatch. If the files are already present (agent state preserved across retries), the copy is skipped. This gives the agent the same secrets and environment as the parent without requiring shared mounts.
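A copy-if-missing step like the one described could look roughly as follows. This is a sketch, not the contents of worktreeSetup.ts: the function name is hypothetical, and checking each file individually (rather than all three as a group) is an assumption.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

const CONFIG_FILES = [".env", ".env.local", "symphony.config.json"];

// Copies each config file that exists in the project root and is missing in the
// worktree. Returns the names that were actually copied.
function copyProjectConfigs(projectRoot: string, worktreeDir: string): string[] {
  const copied: string[] = [];
  for (const name of CONFIG_FILES) {
    const src = path.join(projectRoot, name);
    const dest = path.join(worktreeDir, name);
    // Skip files absent from the root, and files preserved from an earlier run.
    if (!fs.existsSync(src) || fs.existsSync(dest)) continue;
    fs.copyFileSync(src, dest);
    copied.push(name);
  }
  return copied;
}
```

Skipping existing files means a retry never overwrites configuration the agent may have adjusted in a previous run.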


Agent Process

CLI Flags (Claude CLI provider)

claude -p \
  --system-prompt <path>          # layered identity + boundary prompt
  --model <model>                 # resolved from phase → profile → project → global
  --mcp-config <path>             # per-issue MCP server config
  --strict-mcp-config             # reject unknown MCP servers
  --dangerously-skip-permissions  # non-interactive execution
  --disable-slash-commands        # prevent slash-command side-effects
  --output-format stream-json     # NDJSON event stream on stdout
  --verbose                       # emit all event types
  <user-prompt>                   # issue context, inline
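In code, the flag listing above would typically be assembled as an argv array rather than a shell string, which avoids quoting problems with the inline user prompt. The sketch below mirrors the flags exactly as listed; the ClaudeRunOptions type and buildClaudeArgs name are hypothetical, and agentProvider.ts may build the list differently.

```typescript
// Hypothetical option bag; field names are illustrative.
interface ClaudeRunOptions {
  systemPromptPath: string;
  model: string;
  mcpConfigPath: string;
  userPrompt: string;
}

// Returns the argument vector in the same order as the flag listing.
function buildClaudeArgs(opts: ClaudeRunOptions): string[] {
  return [
    "-p",
    "--system-prompt", opts.systemPromptPath,
    "--model", opts.model,
    "--mcp-config", opts.mcpConfigPath,
    "--strict-mcp-config",
    "--dangerously-skip-permissions",
    "--disable-slash-commands",
    "--output-format", "stream-json",
    "--verbose",
    opts.userPrompt, // passed as a single argv entry, so no shell escaping is needed
  ];
}
```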

Process Spawning

The subprocess is launched with detached: true, making Claude CLI the leader of its own process group. This is essential for clean teardown: sending SIGTERM to the negated PID (-pid) signals the entire group, including any child processes Claude spawns (vitest workers, npm scripts, bash subshells in tests).

Environment additions on top of the inherited parent env:

  • CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000
  • Augmented PATH including ~/.local/bin, /opt/homebrew/bin, and /usr/local/bin, so binaries in directories that only a login shell adds to PATH are still found when the orchestrator runs from a GUI app
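A sketch of the spawn call and environment setup, using Node's child_process.spawn. The function names are hypothetical, and appending (rather than prepending) the extra PATH entries is an assumption; agentRunner.ts may order them differently.

```typescript
import { spawn } from "node:child_process";
import type { ChildProcess } from "node:child_process";
import * as os from "node:os";
import * as path from "node:path";

// Builds the child environment: inherited env plus the two additions listed above.
function buildAgentEnv(parent: NodeJS.ProcessEnv): NodeJS.ProcessEnv {
  const extra = [path.join(os.homedir(), ".local", "bin"), "/opt/homebrew/bin", "/usr/local/bin"];
  const current = (parent.PATH ?? "").split(path.delimiter).filter(Boolean);
  const merged = [...current, ...extra.filter((p) => !current.includes(p))];
  return {
    ...parent,
    PATH: merged.join(path.delimiter),
    CLAUDE_CODE_MAX_OUTPUT_TOKENS: "64000",
  };
}

// Launches the CLI as the leader of a new process group.
function spawnAgent(args: string[], cwd: string): ChildProcess {
  return spawn("claude", args, {
    cwd,
    env: buildAgentEnv(process.env),
    detached: true, // child becomes a process group leader, so -pid addresses the whole tree
    stdio: ["ignore", "pipe", "pipe"],
  });
}
```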

Output Capture

| Stream | Handling |
|---|---|
| stdout | Full buffer: Claude CLI NDJSON payload; also parsed line-by-line in real time |
| stderr | Rolling buffer capped at 50 KB: keeps the last 50 KB of output for debugging without unbounded memory growth |

Stdout is parsed as NDJSON (--output-format stream-json --verbose). Each event line is passed through streamParser.ts, formatted into readable log lines ([init], [thinking], [tool_use], [text], [result]), and written to .tmp/logs/<runId>.log for live SSE streaming via /api/orchestrator/logs/stream.
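The two capture strategies can be sketched as a byte-capped rolling buffer for stderr and an incremental line splitter for the NDJSON stream on stdout. Both classes are hypothetical illustrations; streamParser.ts is not reproduced here and its event formatting is omitted.

```typescript
// Keeps only the last `limit` bytes written to it (50 KB by default).
// Trimming on a byte boundary may cut a multi-byte UTF-8 character at the start.
class RollingBuffer {
  private buf: Buffer = Buffer.alloc(0);
  constructor(private readonly limit = 50 * 1024) {}

  push(chunk: Buffer): void {
    const joined = Buffer.concat([this.buf, chunk]);
    this.buf = joined.length > this.limit ? joined.subarray(joined.length - this.limit) : joined;
  }

  toString(): string {
    return this.buf.toString("utf8");
  }
}

// Splits a chunked stream into complete NDJSON events as they arrive.
class NdjsonSplitter {
  private pending = "";

  // Feeds one chunk and returns every event whose line was completed by it.
  feed(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    this.pending = lines.pop() ?? ""; // the last piece is an unfinished line (or "")
    const events: unknown[] = [];
    for (const line of lines) {
      if (line.trim() === "") continue;
      try {
        events.push(JSON.parse(line));
      } catch {
        // Not valid JSON: skip the line rather than abort the stream.
      }
    }
    return events;
  }
}
```

Holding back the trailing partial line matters because stdout chunks do not align with line boundaries.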

MCP Config

writeMcpConfig() writes a JSON file at .tmp/mcp-<issueId>.json registering a single MCP server named symphony:

{
  "mcpServers": {
    "symphony": {
      "command": "npx",
      "args": ["tsx", "server/mcp/index.ts", "--issue-id", "<id>", "--db", "<path>", "--agent-type", "<profile>"]
    }
  }
}

The --agent-type parameter enables phase-scoped tool filtering — each agent only sees tools relevant to its role (e.g., judges see verdict tools but not create_pr; workers see implementation tools but not approve_pr). When omitted, all tools are exposed for backward compatibility.

--strict-mcp-config ensures the agent can only connect to this server and no others.
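Generating that file might look like the sketch below. The McpParams type and both function names are hypothetical; only the JSON shape, the file naming pattern, and the optional --agent-type argument come from the description above.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical parameter bag for illustration.
interface McpParams {
  issueId: string;
  dbPath: string;
  agentType?: string;
}

// Builds the config object with a single server named "symphony".
function buildMcpConfig(p: McpParams) {
  const args = ["tsx", "server/mcp/index.ts", "--issue-id", p.issueId, "--db", p.dbPath];
  if (p.agentType) args.push("--agent-type", p.agentType); // omitted: all tools are exposed
  return { mcpServers: { symphony: { command: "npx", args } } };
}

// Writes mcp-<issueId>.json into the given temp directory and returns its path.
function writeMcpConfigSketch(tmpDir: string, p: McpParams): string {
  const file = path.join(tmpDir, `mcp-${p.issueId}.json`);
  fs.mkdirSync(tmpDir, { recursive: true });
  fs.writeFileSync(file, JSON.stringify(buildMcpConfig(p), null, 2));
  return file;
}
```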

Timeout

Each project configures turn_timeout_ms (default ~600 000 ms / 10 minutes). If the timer fires before the agent exits, killProcessTree() is called and the agent is marked timedOut: true. The exit handler then treats the run as a failure and schedules a retry.

Process Tree Kill

killProcessTree() in agentRunner.ts:

  1. Sends SIGTERM to the entire process group via process.kill(-pid, 'SIGTERM')
  2. Falls back to direct proc.kill('SIGTERM') if the group kill fails (e.g., process is not a group leader)
  3. Schedules a forced SIGKILL to both group and process after 10 seconds if the process has not exited
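The three steps can be sketched as follows. This is not the code in agentRunner.ts: killTreeSketch, the hasExited callback, and the injectable send function are assumptions made so the logic can be exercised without real processes. By default the sketch uses process.kill for both the group and the direct signal.

```typescript
type SendSignal = (pid: number, signal: NodeJS.Signals) => void;

// Returns the SIGKILL timer so a caller can clear it once the process has exited.
function killTreeSketch(
  pid: number,
  hasExited: () => boolean,
  send: SendSignal = (p, s) => { process.kill(p, s); },
  graceMs = 10_000,
): NodeJS.Timeout {
  try {
    send(-pid, "SIGTERM"); // step 1: negative pid targets the whole process group
  } catch {
    try {
      send(pid, "SIGTERM"); // step 2: fall back to the process itself
    } catch {
      // Process is already gone; nothing to signal.
    }
  }
  // Step 3: force-kill both group and process if still alive after the grace period.
  const timer = setTimeout(() => {
    if (hasExited()) return;
    for (const target of [-pid, pid]) {
      try {
        send(target, "SIGKILL");
      } catch {
        // Group or process already gone.
      }
    }
  }, graceMs);
  timer.unref(); // do not keep the orchestrator alive just for this timer
  return timer;
}
```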

Exit Handling

1. Capture and Cleanup

When the process closes, the exit handler immediately:

  • Removes the agent from the in-memory running map, releasing the concurrency slot
  • Calls autoCommitWorktree() — runs git add -A && git commit in the worktree; silently no-ops if nothing is uncommitted
  • Cleans up temp files (system prompt file, MCP config)
  • Parses token usage from the result event in the NDJSON stream (input tokens, output tokens, cache reads, cost)
  • Writes an agent_runs record with duration, exit code, token counts, and the formatted log output

2. Success vs. Failure

A run that exits with code 0 is marked succeeded; a timeout or a non-zero exit code is marked failed.

For successful exits the handler checks whether the agent actually changed the issue status via MCP tools:

| Agent type | Outcome | Action |
|---|---|---|
| Any | Status changed by MCP tool during run | Honor the new status; no retry |
| Judge (phase review) | Status unchanged | Leave for phase transition processor |
| Judge (PR review) | Status still review, no tool call | Stay in review; retry next cycle |
| Worker | Status still in_progress | Continuation retry with 1 s delay |
| Planner | No sub-tasks created | Retry or move to terminal status |

Judges must explicitly call approve_pr/reject_pr or approve_phase/reject_phase. An exit without a tool call is not treated as approval.
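The routing rules for successful exits can be read as a small decision function. The sketch below encodes only the five documented rows; the type names and action labels are hypothetical, and a planner that created sub-tasks without changing status is reported as not covered because this document does not specify that case.

```typescript
type AgentKind = "worker" | "judge_phase" | "judge_pr" | "planner";

type SuccessAction =
  | "honor_new_status"
  | "leave_for_phase_processor"
  | "stay_in_review"
  | "continuation_retry"
  | "retry_or_terminal"
  | "not_covered";

interface SuccessfulRun {
  agentKind: AgentKind;
  statusChangedByTool: boolean; // did an MCP tool change the issue status during the run?
  subTasksCreated?: number;     // only meaningful for planners
}

function routeSuccessfulExit(run: SuccessfulRun): SuccessAction {
  // A status change made through MCP tools always wins, for every agent type.
  if (run.statusChangedByTool) return "honor_new_status";
  switch (run.agentKind) {
    case "judge_phase":
      return "leave_for_phase_processor";
    case "judge_pr":
      return "stay_in_review"; // exiting without a verdict is never treated as approval
    case "worker":
      return "continuation_retry"; // retried after a 1 s delay
    case "planner":
      return (run.subTasksCreated ?? 0) === 0 ? "retry_or_terminal" : "not_covered";
  }
}
```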

3. Failure Routing

Failed runs track consecutive failures in an in-memory map and check two limits:

if failures >= max_retries OR totalRuns >= max_retries:
    circuit breaker:
        all phases → status: todo  (re-dispatched based on current phase label)
    record system comment with error details
    clear failure counter
else:
    judges stay in 'review', others go to 'todo'
    schedule exponential backoff: 10s x 2^(attempt-1), capped at maxRetryBackoffMs
    increment failure counter

The totalRuns >= max_retries check counts all runs for the issue, not just consecutive failures. Without it, an occasional success would reset the consecutive-failure counter and let a run of transient errors mask a systematic problem.
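The failure routing above can be expressed as a pure function. This is a sketch under assumptions: the type and function names are hypothetical, and the attempt number is taken to be the consecutive-failure count plus one, which matches the 10 s, 20 s, 40 s progression but is not confirmed by exitHandler.ts.

```typescript
interface FailureState {
  consecutiveFailures: number; // failures recorded before this one
  totalRuns: number;           // all runs for the issue, successful or not
}

interface RetryLimits {
  maxRetries: number;
  maxRetryBackoffMs: number;
}

// 10 s x 2^(attempt-1), capped at capMs; attempt is 1-based.
function backoffMs(attempt: number, capMs: number): number {
  return Math.min(10_000 * 2 ** (attempt - 1), capMs);
}

type FailureDecision =
  | { kind: "circuit_breaker"; status: "todo" }
  | { kind: "retry"; status: "todo" | "review"; delayMs: number };

function routeFailure(state: FailureState, limits: RetryLimits, isJudge: boolean): FailureDecision {
  if (state.consecutiveFailures >= limits.maxRetries || state.totalRuns >= limits.maxRetries) {
    return { kind: "circuit_breaker", status: "todo" }; // caller also records a comment and clears the counter
  }
  const attempt = state.consecutiveFailures + 1; // assumed mapping from counter to attempt
  return {
    kind: "retry",
    status: isJudge ? "review" : "todo",
    delayMs: backoffMs(attempt, limits.maxRetryBackoffMs),
  };
}
```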

4. Retry Queue

The orchestrator maintains an in-memory RetryQueue with entries processed each tick:

| Entry type | Delay | Trigger |
|---|---|---|
| Continuation retry | 1 s | Worker succeeded but did not change status |
| Exponential backoff | 10 s, 20 s, 40 s… | Failed agent |

Entries due (dueAtMs <= now) are processed in the "Process Retries" phase of the tick loop. The orchestrator re-dispatches the issue, sending the correct agent type based on its current phase label.
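A minimal in-memory queue with this due-time behavior might look like the sketch below. RetryQueueSketch is a hypothetical illustration rather than the class in retryQueue.ts, and keeping at most one pending entry per issue is an assumption.

```typescript
interface RetryEntry {
  issueId: string;
  kind: "continuation" | "backoff";
  dueAtMs: number;
}

class RetryQueueSketch {
  // Keyed by issue so that rescheduling replaces the earlier entry (assumption).
  private entries = new Map<string, RetryEntry>();

  schedule(issueId: string, kind: RetryEntry["kind"], delayMs: number, now = Date.now()): void {
    this.entries.set(issueId, { issueId, kind, dueAtMs: now + delayMs });
  }

  // Removes and returns every entry with dueAtMs <= now, earliest first.
  takeDue(now = Date.now()): RetryEntry[] {
    const due = Array.from(this.entries.values()).filter((e) => e.dueAtMs <= now);
    for (const e of due) this.entries.delete(e.issueId);
    return due.sort((a, b) => a.dueAtMs - b.dueAtMs);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Each orchestrator tick would call takeDue() once and re-dispatch the returned issues; because the queue lives in memory, pending retries do not survive a restart.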


Cleanup

When an issue reaches a terminal status (done or cancelled), the worktree is removed:

  1. Read project config to resolve the workspace directory name
  2. Call removeWorktree() — deletes .symphony-workspaces/<identifier>/ and removes the associated git branch
  3. Log failures but do not retry — cleanup is best-effort

Worktrees are gitignored (.symphony-workspaces/), so orphaned directories from crashes do not affect the repository. git worktree prune can be run manually to reconcile stale references.


Key Files

| File | Responsibility |
|---|---|
| server/orchestrator/agentRunner.ts | Spawns subprocess, captures output streams, implements process tree kill |
| server/orchestrator/agentProvider.ts | Builds CLI flags per provider (Claude CLI, Codex CLI) |
| server/orchestrator/exitHandler.ts | Success/failure routing, auto-commit, retry scheduling, worktree cleanup |
| server/orchestrator/workspaceManager.ts | Creates/reuses git worktrees, merges latest main, removes on completion |
| server/orchestrator/worktreeSetup.ts | Injects workspace boundary into system prompt, copies project configs |
| server/orchestrator/promptBuilder.ts | Builds base system prompt (agent identity + conventions) |
| server/orchestrator/retryQueue.ts | In-memory queue for continuation and exponential backoff retries |
| server/orchestrator/streamParser.ts | Parses NDJSON events from --output-format stream-json --verbose |
| server/orchestrator/usageParser.ts | Extracts token counts and cost from the result stream event |

  • Architecture — Three-process model and how the orchestrator fits in
  • Phase Pipeline — Phase completion claims and judge-gated transitions
  • Agent Types — Worker, judge, planner, researcher, and scanner profiles
  • MCP Tools — Tools agents use to signal status changes and completion