
Phase Transitions

Overview

Phase Transitions orchestrate the movement of issues through the readiness pipeline (research → architecture → grooming → ready) by processing completion claims from agents and managing judge-gated reviews.

Why it matters:

  • Validates artifact quality before advancing phases
  • Enforces judge review for sensitive changes (security, API, database)
  • Tracks phase metrics and rejection history
  • Prevents stale claims and contract version mismatches
  • Emits hooks for external systems to react to phase advancement

The system runs once per orchestrator tick and processes all pending phase completion findings atomically.


How It Works

1. Phase Completion Flow

2. Precondition Validation

Preconditions are checked in two stages with different failure behaviors:

Stage 1 — before contract load (silent failure, no needs-revision label):

| Precondition | Source | Check |
| --- | --- | --- |
| Phase match | Issue labels | issue.labels.currentPhase === claim.completed_phase |

If the phase does not match, markClaimProcessed is called with a stale-claim reason. No needs-revision label is added and rejection metrics are not incremented — the claim is simply discarded. A structured INFO log is emitted for observability.

Stage 2 — after contract load (rejection with needs-revision label):

| Precondition | Source | Check |
| --- | --- | --- |
| Contract version | Phase contract + claim | contract.version === claim.contract_version |
| Next phase | Phase contract + claim | contract.next_phase === claim.recommended_next_phase |

If either fails, rejectClaim is called: the needs-revision label is added and the phase rejection counter is incremented. A structured WARN log is emitted with the rejection reason.
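The two stages can be sketched as a single function that returns which failure path applies. This is a minimal illustration, not the code in phaseValidator.ts: the `Claim`, `Contract`, and `PreconditionResult` shapes and the `loadContract` callback are hypothetical and carry only the fields the checks need.

```typescript
// Hypothetical shapes: only the fields used by the precondition checks.
interface Claim {
  completed_phase: string;
  contract_version: number;
  recommended_next_phase: string | null;
}

interface Contract {
  version: number;
  next_phase: string | null;
}

type PreconditionResult =
  | { outcome: "ok" }
  | { outcome: "stale"; reason: string } // stage 1: discard, no label, no counter
  | { outcome: "rejected"; reason: string }; // stage 2: needs-revision + counter

function checkPreconditions(
  currentPhase: string,
  claim: Claim,
  loadContract: (phase: string) => Contract, // only invoked if stage 1 passes
): PreconditionResult {
  // Stage 1: phase match, checked before the contract is loaded.
  if (currentPhase !== claim.completed_phase) {
    return {
      outcome: "stale",
      reason: `issue is in ${currentPhase}, claim completed ${claim.completed_phase}`,
    };
  }

  // Stage 2: contract version and next phase.
  const contract = loadContract(claim.completed_phase);
  if (contract.version !== claim.contract_version) {
    return {
      outcome: "rejected",
      reason: `contract version ${contract.version} != claim version ${claim.contract_version}`,
    };
  }
  if (contract.next_phase !== claim.recommended_next_phase) {
    return {
      outcome: "rejected",
      reason: `contract next phase ${contract.next_phase} != recommended ${claim.recommended_next_phase}`,
    };
  }
  return { outcome: "ok" };
}
```

In this sketch the caller would map `stale` to markClaimProcessed and `rejected` to rejectClaim, so the difference in side effects stays outside the check itself.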

3. Artifact Resolution

Artifacts are checked in order of precedence:

  1. Worktree first: {latestRun.worktreePath}/{artifactPath} (where the agent worked)
  2. Project root fallback: {projectRoot}/{artifactPath} (if the worktree is unavailable)

The orchestrator uses the latest agent run's worktreePath to find in-progress work.
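A minimal sketch of the lookup order follows. The function name and the injected `exists` predicate are hypothetical; injecting the predicate keeps the example free of file-system calls. Falling back to the project root when the file is simply missing from the worktree (not only when the worktree itself is unavailable) is an assumption of this sketch.

```typescript
// Hypothetical helper illustrating "worktree first, project root fallback".
function resolveArtifactPath(
  artifactPath: string,
  projectRoot: string,
  worktreePath: string | null, // latest agent run's worktree, if any
  exists: (absolutePath: string) => boolean, // injected file-existence check
): string | null {
  const candidates: string[] = [];
  if (worktreePath) {
    candidates.push(`${worktreePath}/${artifactPath}`);
  }
  candidates.push(`${projectRoot}/${artifactPath}`);

  // First candidate that exists wins; null means the artifact was not found.
  for (const candidate of candidates) {
    if (exists(candidate)) return candidate;
  }
  return null;
}
```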

4. Structural Validation

Required sections are extracted from markdown headings:

  • Heading: ## Problem Statement → key: problem_statement
  • Heading: ## Problem → matches problem_statement (fuzzy substring match)
  • Normalized comparison: spaces/hyphens converted to underscores, case-insensitive

Example (research phase requires):

  • problem_statement
  • relevant_codepaths
  • constraints
  • open_questions
  • risks
  • recommendation
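The heading extraction and fuzzy comparison described above can be sketched as follows. The document names findMissingSections() elsewhere, but this body is an illustrative reconstruction from the stated rules (level-2 headings, normalization, substring match in either direction), not the actual implementation; `normalizeKey` and `extractSectionKeys` are hypothetical helper names.

```typescript
// Normalize a heading or section key: lowercase, spaces/hyphens -> underscores.
function normalizeKey(text: string): string {
  return text.trim().toLowerCase().replace(/[\s-]+/g, "_");
}

// Collect normalized keys from level-2 markdown headings ("## ...").
function extractSectionKeys(markdown: string): string[] {
  const keys: string[] = [];
  const heading = /^##[ \t]+(.+)$/gm;
  let match: RegExpExecArray | null;
  while ((match = heading.exec(markdown)) !== null) {
    const key = normalizeKey(match[1]);
    if (key.length > 0) keys.push(key);
  }
  return keys;
}

// Return the required sections with no matching heading.
// A heading matches when either key contains the other (fuzzy substring match).
function findMissingSections(markdown: string, required: string[]): string[] {
  const present = extractSectionKeys(markdown);
  return required.filter((section) => {
    const wanted = normalizeKey(section);
    return !present.some((key) => key.includes(wanted) || wanted.includes(key));
  });
}
```

With this matching rule a heading of `## Problem` satisfies `problem_statement`, which is convenient for agents but also means a loosely titled artifact can pass.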

5. Escalation Detection

Structural contracts can trigger automatic escalation to judge review. The supported trigger is:

```yaml
escalate_if:
  - open_questions_present   # claim.open_questions.length > 0
```

Additional triggers are defined in the contract schema (api_boundary_changed, db_schema_changed, security_sensitive) but currently rely on artifact content being passed to the escalation check, which is not yet wired in the implementation.

Flow:

  • Contract mode = structural + escalate_if triggers match → force judge review
  • Contract mode = judge → always require judge
  • Contract mode = trust → no judge needed (rare)
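The mode and trigger rules above reduce to a small decision function. The sketch below is hypothetical (names and input shape are not taken from escalation.ts) and evaluates only `open_questions_present`, since the artifact-content triggers are described as not yet wired.

```typescript
type ValidationMode = "structural" | "judge" | "trust";

// Hypothetical input shape combining contract and claim fields.
interface EscalationInput {
  mode: ValidationMode; // contract validation.mode
  escalateIf: string[]; // contract validation.escalate_if
  openQuestions: string[]; // claim.open_questions
}

function requiresJudgeReview(input: EscalationInput): boolean {
  if (input.mode === "judge") return true; // always judged
  if (input.mode === "trust") return false; // never judged

  // structural: judge only if a declared trigger fires.
  // Only the claim-based trigger is evaluated in this sketch.
  return (
    input.escalateIf.includes("open_questions_present") &&
    input.openQuestions.length > 0
  );
}
```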

6. Judge Review Workflow

When judge review is needed:

  1. Check for verdict — Look for PhaseVerdictFinding matching this claim
    • Prefer unprocessed verdict
    • Fall back to any verdict matching the artifact hash (recovers the case where the verdict was already marked processed but the claim was not)
  2. No verdict → Record in needsJudgeReview queue for judge dispatch
    • Judge will be dispatched in next tick
    • Claim stays skipped until verdict arrives
  3. Verdict arrived → Check verdict
    • approved → fall through to advance
    • rejected → reject claim and verdict, add needs-revision label, emit WARN log with verdict details
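The verdict lookup and the three outcomes can be sketched as below. The `Verdict` shape is a trimmed version of PhaseVerdictFinding; `JudgeStep` and `nextJudgeStep` are hypothetical names, and matching on phase in addition to artifact hash is an assumption of this sketch.

```typescript
// Trimmed verdict shape: only the fields the lookup needs.
interface Verdict {
  id: string;
  phase: string;
  artifact_hash: string;
  verdict: "approved" | "rejected";
  processed: boolean;
}

type JudgeStep =
  | { action: "queue_for_judge" } // no verdict yet: record in needsJudgeReview
  | { action: "advance"; verdictId: string }
  | { action: "reject"; verdictId: string };

function nextJudgeStep(
  verdicts: Verdict[],
  phase: string,
  artifactHash: string,
): JudgeStep {
  const matching = verdicts.filter(
    (v) => v.phase === phase && v.artifact_hash === artifactHash,
  );

  // Prefer an unprocessed verdict; otherwise reuse any matching one.
  const chosen: Verdict | undefined =
    matching.find((v) => !v.processed) ?? matching[0];

  if (!chosen) return { action: "queue_for_judge" };
  return chosen.verdict === "approved"
    ? { action: "advance", verdictId: chosen.id }
    : { action: "reject", verdictId: chosen.id };
}
```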

7. Phase Advancement

When all checks pass:

Atomic DB write (single transaction):

```typescript
// Pseudocode: shape of the single issue update written in one transaction
{
  status: targetPhase === 'ready' ? 'todo' : 'backlog',
  labels: swapPhaseLabel(currentPhase, targetPhase).filter(l => l !== 'needs-revision'),
  findings: /* all phase findings marked processed */,
  phaseMetrics: {
    entered_phase_at: now,
    phase_rejection_count: 0,           // reset on advance
    last_completion_hash: artifactHash,
    last_verdict: verdict ? 'approved' : null   // 'approved' if a judge verdict was used, otherwise null
  },
  artifacts: [...previous, { phase, path, status: 'approved', hash, revision }],
  updatedAt: now
}
```

Audit comment: System comment records phase transition, validation mode, artifact path.

Hook emission: on-phase-advance hook emitted with new issue state, project config, and phase.
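The label and status handling of an advance can be sketched as a pure function that builds the update before it is written. This is an illustration under assumptions: phase labels are taken to have the form `phase:<name>`, `swapPhaseLabel` here receives the label list explicitly (the real helper's signature may differ), and `buildAdvanceUpdate` and its `judged` flag are hypothetical.

```typescript
// Assumes phase labels look like "phase:<name>".
function swapPhaseLabel(labels: string[], from: string, to: string): string[] {
  const withoutOld = labels.filter((l) => l !== `phase:${from}`);
  const next = `phase:${to}`;
  return withoutOld.includes(next) ? withoutOld : [...withoutOld, next];
}

// Hypothetical builder for the fields that change on advance.
function buildAdvanceUpdate(
  labels: string[],
  currentPhase: string,
  targetPhase: string,
  artifactHash: string,
  judged: boolean, // true if a judge verdict approved this claim
  now: string, // ISO timestamp
) {
  return {
    // Entering "ready" makes the issue dispatchable; other phases wait in backlog.
    status: targetPhase === "ready" ? "todo" : "backlog",
    labels: swapPhaseLabel(labels, currentPhase, targetPhase).filter(
      (l) => l !== "needs-revision",
    ),
    phaseMetrics: {
      entered_phase_at: now,
      phase_rejection_count: 0, // reset on advance
      last_completion_hash: artifactHash,
      last_verdict: judged ? "approved" : null,
    },
    updatedAt: now,
  };
}
```

Building the whole update first and writing it once is what keeps the claim state and the phase label from drifting apart.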

8. Special Cases

Grooming → Ready transition:

  • Status set to todo (not backlog), so the issue is ready for immediate worker dispatch
  • Execution plan stored at artifact path (worker reads this)
  • Special system comment indicates worker should reference artifact for step-by-step instructions

Phase:ready (terminal phase):

  • No next phase in pipeline — orchestrator auto-creates a PR from the worktree diff
  • Issue moves to review status (not backlog)
  • PR creation is skipped if one already exists for the issue

Adaptive Phase Skipping:

  • Agents can recommend skipping phases via skip_to in complete_phase
  • Orchestrator validates the skip is forward-only and within the pipeline
  • Skipped phases get skip:<phase> labels for traceability
  • A phase_skip finding is recorded with the skip reason
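A minimal sketch of the skip validation, assuming the four-phase pipeline listed in the overview. `validateSkip` and `SkipResult` are hypothetical names; treating a `skip_to` equal to the normal next phase as valid with nothing skipped is an assumption of this sketch.

```typescript
const PIPELINE: readonly string[] = ["research", "architecture", "grooming", "ready"];

type SkipResult =
  | { valid: true; skipped: string[]; labels: string[] }
  | { valid: false; reason: string };

function validateSkip(currentPhase: string, skipTo: string): SkipResult {
  const from = PIPELINE.indexOf(currentPhase);
  const to = PIPELINE.indexOf(skipTo);

  if (from === -1 || to === -1) {
    return { valid: false, reason: "phase is not in the pipeline" };
  }
  if (to <= from) {
    return { valid: false, reason: "skip must move forward" };
  }

  // Phases strictly between the current phase and the target are skipped.
  const skipped = PIPELINE.slice(from + 1, to);
  return { valid: true, skipped, labels: skipped.map((p) => `skip:${p}`) };
}
```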

Artifact Promotion:

  • If claim includes promote_to path, copy artifact from ticket/phase location to permanent location
  • Example: copy research brief from docs/tickets/SYM-001/research.md to docs/research/feature-xyz.md
  • Recorded in audit comment

Key Components

| File | Responsibility |
| --- | --- |
| phaseTransition.ts | Main entry point; processes all transitions and expiry checks |
| phaseValidator.ts | Precondition validation, artifact path resolution, section checking |
| phaseArtifactHandler.ts | Claim/verdict processing, rejection with metrics, findings updates |
| contractParser.ts | Parse YAML frontmatter, validate contracts at boot |
| escalation.ts | Detect escalation triggers from claim and artifact content |
| artifactPromotion.ts | Copy artifacts from ticket location to permanent location (optional) |

Data Structures

Phase Completion Finding

```typescript
interface PhaseCompletionFinding {
  id: string
  category: 'phase_completion'
  summary: string                    // e.g., "Research complete"

  // Phase metadata
  completed_phase: string            // 'research', 'architecture', 'grooming', 'ready'
  contract_version: number           // Must match contract version
  recommended_next_phase: string | null

  // Artifact reference
  artifact_path: string              // e.g., 'docs/tickets/SYM-001/research.md'
  artifact_hash: string              // Content hash for verdict matching
  artifact_revision?: number

  // Quality metadata
  confidence: 'low' | 'medium' | 'high'
  open_questions: string[]           // Can trigger escalation

  // Optional promotion
  promote_to: string | null          // Promote artifact to permanent location

  // Processing state
  processed: boolean                 // Marked after decision made
  rejection_reason?: string
  created_at: string
}
```

Phase Verdict Finding

```typescript
interface PhaseVerdictFinding {
  id: string
  category: 'phase_verdict'
  summary: string                    // e.g., "Phase approved"

  // Judge decision
  phase: string
  artifact_hash: string              // Links to completion claim
  verdict: 'approved' | 'rejected'
  reason: string                     // Judge feedback
  fix_instructions?: string | null

  // Processing state
  processed: boolean
  created_at: string
}
```

Phase Metrics

```typescript
interface PhaseMetrics {
  entered_phase_at: string           // ISO timestamp
  phase_rejection_count: number      // Resets on advance, increments on rejection
  last_completion_hash: string | null
  last_verdict: 'approved' | 'rejected' | null
}
```

Issue Artifact

```typescript
interface IssueArtifact {
  phase: string                      // Which phase produced this
  path: string                       // Where it's stored
  status: 'approved' | 'rejected'    // Completion status
  revision: number                   // Attempt count for this phase
  hash: string                       // Content hash
}
```

Phase Contract Format

Phase contracts are YAML-frontmatter-delimited markdown files in prompts/phases/. All four phases have contract files: research.md, architecture.md, grooming.md, ready.md.

```yaml
---
phase: research
contract_version: 1
dispatch_profile: researcher
model: sonnet
artifact_type: research_brief
artifact_path: "docs/tickets/{identifier}/research.md"

required_sections:
  - problem_statement
  - relevant_codepaths
  - constraints
  - open_questions
  - risks
  - recommendation

done_when:
  - artifact_exists
  - required_sections_present
  - next_phase_recommended

next_phase: architecture

validation:
  mode: structural
  structural_checks:
    - artifact_exists
    - required_sections_present
  escalate_if:
    - security_sensitive
---

[Contract body: instructions for agent]
```
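To illustrate the file layout, the sketch below separates the frontmatter from the contract body and reads a top-level scalar field. It is deliberately not a YAML parser: nested keys and lists are ignored, and a real loader would hand the frontmatter string to a proper YAML library. `splitContract` and `readScalar` are hypothetical helpers, not functions from contractParser.ts.

```typescript
// Split a contract file into its raw YAML frontmatter and markdown body.
// Returns null when the file does not start with a "---" delimited block.
function splitContract(
  source: string,
): { frontmatter: string; body: string } | null {
  const match = /^---\r?\n([\s\S]*?)\r?\n---[ \t]*(?:\r?\n|$)([\s\S]*)$/.exec(source);
  if (!match) return null;
  return { frontmatter: match[1], body: match[2].trim() };
}

// Read a top-level "key: value" scalar; indented (nested) keys never match.
function readScalar(frontmatter: string, key: string): string | null {
  for (const line of frontmatter.split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon > 0 && line.slice(0, colon) === key) {
      // Strip surrounding double quotes, if any.
      const value = line.slice(colon + 1).trim().replace(/^"(.*)"$/, "$1");
      return value.length > 0 ? value : null;
    }
  }
  return null;
}
```

A boot-time check could, for example, compare `readScalar(frontmatter, "phase")` against the file name to catch a contract saved under the wrong phase.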

Contract Fields

| Field | Type | Purpose |
| --- | --- | --- |
| phase | string | Which phase (research, architecture, grooming, ready) |
| contract_version | number | Claim must match to prevent stale processing |
| dispatch_profile | string | Which agent type handles this (researcher, architect, planner, worker) |
| model | string | Optional model override (e.g., sonnet for cheaper phases) |
| artifact_type | string | Descriptive type for UI (research_brief, design_doc, execution_plan) |
| artifact_path | string | Where artifact goes; {identifier} replaced with issue ID |
| required_sections | string[] | Markdown ## headings that must exist |
| done_when | string[] | Descriptive list of completion criteria |
| next_phase | string \| null | Phase to advance to (null = terminal phase) |
| validation.mode | 'structural' \| 'judge' \| 'trust' | How strict to be |
| validation.structural_checks | string[] | Which checks to run |
| validation.escalate_if | string[] | Triggers for automatic judge review |

Needs-Revision Expiry

Separate mechanism: processNeedsRevisionExpiry()

Issues that carry the needs-revision label for longer than maxAgeMs (default 30 min) are moved to todo for re-dispatch (circuit breaker pattern):

  1. Check if issue has needs-revision label
  2. Don't expire if there's an unprocessed completion claim (agent is retrying)
  3. Check age: now - updatedAt > maxAgeMs
  4. If expired: move to todo, remove needs-revision, post system comment

Purpose: Prevent agents from retrying forever without human intervention.
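The expiry decision can be sketched as a pure predicate over the checks listed above. `ExpiryCandidate`, `isNeedsRevisionExpired`, and the `hasUnprocessedClaim` flag are hypothetical; the real processNeedsRevisionExpiry() also moves the issue and posts the system comment, which this sketch leaves out.

```typescript
// Hypothetical view of an issue: only what the expiry check needs.
interface ExpiryCandidate {
  labels: string[];
  updatedAt: string; // ISO timestamp
  hasUnprocessedClaim: boolean; // true while an agent retry is pending
}

const DEFAULT_MAX_AGE_MS = 30 * 60 * 1000; // 30 minutes

function isNeedsRevisionExpired(
  issue: ExpiryCandidate,
  now: Date,
  maxAgeMs: number = DEFAULT_MAX_AGE_MS,
): boolean {
  // 1. Only issues currently labelled needs-revision can expire.
  if (!issue.labels.includes("needs-revision")) return false;
  // 2. An unprocessed completion claim means the agent is retrying.
  if (issue.hasUnprocessedClaim) return false;
  // 3. Expire once the issue has been untouched for longer than maxAgeMs.
  const ageMs = now.getTime() - new Date(issue.updatedAt).getTime();
  return ageMs > maxAgeMs;
}
```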


Design Decisions

Precondition Validation Before Artifact Check

Validate phase match, contract version, and next phase BEFORE reading files. This catches stale claims quickly (phase changed by another agent), avoids I/O for invalid claims, and surfaces spec mismatches clearly rather than as confusing artifact errors.

Comprehensive Rejection Logging

All rejection paths emit structured logs with consistent context (identifier, phase, claim_id). INFO level for benign stale claims, WARN level for operational issues requiring investigation. Enables debugging failed transitions and tuning contract requirements based on which sections agents frequently miss.

Escalation As Optional Feature

Escalation triggers are declarative in the contract, optional per phase. Allows lightweight validation for straightforward changes while sensitive changes can be automatically escalated regardless of mode.

Judge Workflow Allows Stale Verdict Recovery

If no unprocessed verdict, fall back to any verdict matching artifact hash. Handles the edge case where a verdict was processed but the claim wasn't (race condition), preventing duplicate judge dispatch for the same artifact.

Atomic Phase Advancement

Single DB write for all state changes (status, labels, metrics, artifacts, findings). Prevents partial state where a claim is marked processed but the phase didn't advance.

Rejection Increments Phase Metrics, Not Total Runs

PhaseMetrics.phase_rejection_count resets on phase advance. Rejection count is specific to the current phase (useful for UX). The exit handler tracks total runs separately for the circuit breaker.

Grooming → Ready Sets Status to todo

phase:ready means "ready for worker" — should enter the execution pipeline immediately. todo tells the worker dispatcher this issue is dispatchable now. All other phase advances set status to backlog.

Contract Version Must Exact Match

Claim contract version must exactly match the loaded contract version. Catches orchestrator upgrades that changed the phase spec and forces agents to retry with the new contract.

Version validation is enforced at two layers:

  1. MCP tool layer (completePhase.ts) — fast in-session feedback when agent calls complete_phase with wrong version
  2. Orchestrator layer (phaseValidator.ts) — safety net during transition processing

A claim with no version is treated as v0 and is rejected against any versioned contract (v1+). Each phase contract instructs agents to pass contract_version from the frontmatter in their complete_phase call.


Known Limitations

Stale Worktree References: If an agent exits uncleanly, the worktree may not be cleaned up. Phase transitions fall back to project root (safe, but unexpected). Worktree GC runs periodically to address this.

Escalation Triggers Are Partially Wired: The api_boundary_changed, db_schema_changed, and security_sensitive triggers are defined in the contract schema and implemented in escalation.ts, but require artifact content to be passed at call time. Only open_questions_present (which operates on the claim) fires reliably today.

Missing Sections Checked By Substring Match: findMissingSections() uses fuzzy substring matching (e.g., "problem" matches "problem_statement"). May pass validation for badly formatted artifacts.

Early Content Validation in MCP Tool: The complete_phase MCP tool validates artifact content (minimum length, Markdown format, required sections) before recording the claim. This gives agents immediate feedback in the same turn. The orchestrator still validates independently as defense-in-depth.

Judge Verdict Lookup Uses Artifact Hash: If an agent revises an artifact but resubmits the same claim, a verdict for the old artifact could be reused.

Needs-Revision Expiry Time Is Global: Same 30-minute timeout for all issues regardless of complexity.


Integration Points

Called From

  • orchestrator.ts (tick loop) — calls processPhaseTransitions() and processNeedsRevisionExpiry() each tick

Calls To

  • contractParser.ts — loads and parses phase contracts
  • phaseValidator.ts — validates preconditions, resolves artifacts
  • escalation.ts — checks escalation triggers
  • phaseArtifactHandler.ts — marks claims/verdicts, tracks metrics
  • dispatcher.ts — consumes needsJudgeReview list for judge dispatch

Reads From

  • issues table — finds issues with unprocessed phase_completion findings
  • agentRuns table — gets latest run's worktree path for artifact resolution
  • projects table — fetches project config for hook emission

Writes To

  • issues table — updates status, labels, findings, metrics, artifacts
  • issueComments table — audit comments for transitions and rejections

Emits

  • on-phase-advance hook — when an issue advances phase
  • phase_advance, phase_reject, phase_skip, status_change, label_add, label_remove, pr_created issue events

Testing

Phase transitions are tested in tests/server/orchestrator/phaseTransition.test.ts:

  • Structural validation: artifact existence, required sections
  • Precondition checks: phase mismatch, contract version mismatch
  • Judge review: escalation triggers, verdict waiting
  • Artifact promotion: moving artifacts to permanent location
  • Needs-revision expiry: stale label cleanup
  • Metrics: rejection counting, reset on advance

```bash
npx vitest run tests/server/orchestrator/phaseTransition.test.ts
```