/ultra-review - Ultra Review System
Orchestrates parallel code review using specialized agents. All findings written to JSON files to prevent context window pollution.
Workflow Tracking (MANDATORY)
On command start, create tasks for each major step using TaskCreate:
| Step | Subject | activeForm | |------|---------|------------| | 1 | Setup & Scope Detection | Detecting diff scope... | | 2 | Agent Selection | Selecting review agents... | | 3 | Background Execution | Running review agents... | | 4 | Wait & Coordinate | Waiting for agent results... | | 5 | Report to User | Generating review report... |
Before each step: TaskUpdate → status: "in_progress"
After each step: TaskUpdate → status: "completed"
On context recovery: TaskList → resume from last incomplete step
Usage
/ultra-review # Full review (smart skip based on diff content)
/ultra-review all # Force ALL 6 agents, no auto-skip (pre-merge gate)
/ultra-review quick # Quick review (review-code only)
/ultra-review security # Security focus (review-code + review-errors)
/ultra-review tests # Test quality focus (review-tests only)
/ultra-review recheck # Re-check only P0/P1 files from last session
/ultra-review delta # Review only changes since last review session
/ultra-review <agents...> # Custom: e.g., /ultra-review tests errors types
Scope Options (combinable with any mode)
/ultra-review --pr 123 # Review PR #123 diff (base...head)
/ultra-review --range main..HEAD # Review specific commit range
/ultra-review --range abc123 # Review single commit
/ultra-review security --pr 42 # Security review scoped to PR #42
Orchestration Flow
Phase 1: Setup & Scope Detection
# 1. Determine session context
BRANCH = git branch --show-current # e.g., "feat/task-3-auth"
ITER = count existing sessions matching this branch + 1 # iteration number
# 2. Create session directory with full context
SESSION_ID = "<YYYYMMDD-HHmmss>-<branch>-iter<N>"
# e.g., "20260214-103000-feat-task-3-auth-iter2"
SESSION_PATH = ".ultra/reviews/{SESSION_ID}/"
mkdir -p SESSION_PATH
# 3. Update session index (placeholder — verdict filled in Phase 5)
# Append entry to .ultra/reviews/index.json with verdict="pending"
# See Session Index below for full schema
# 4. Cleanup (see Lifecycle Management below)
# 5. Detect diff scope (priority order)
Scope resolution priority:
| Flag | Diff Command | DIFF_RANGE |
|------|-------------|------------|
| --pr NUMBER | gh pr diff NUMBER --name-only | PR base...head |
| --range RANGE | git diff RANGE --name-only | Specified range |
| (default) | git diff --name-only HEAD + git diff --cached --name-only | HEAD (unstaged + staged) |
PR scope details (--pr NUMBER):
# Get PR metadata for session naming
gh pr view NUMBER --json headRefName,baseRefName,title
# Get changed files
gh pr diff NUMBER --name-only
# Set DIFF_RANGE to base...head for agents
DIFF_RANGE="$(gh pr view NUMBER --json baseRefName -q .baseRefName)..HEAD"
Range scope details (--range RANGE):
git diff RANGE --name-only
DIFF_RANGE="RANGE"
Phase 2: Agent Selection
Based on the mode argument and file classification:
Mode: full (default)
Select all applicable agents based on diff content:
| Agent | Skip Condition | Warning | |-------|---------------|---------| | review-code | Never skip | - | | review-tests | No test files in diff AND no source files without tests | WARN: "Code changed without test updates" | | review-errors | No error handling code detected | - | | review-design | No type definitions AND no function-level changes in diff | - | | review-comments | No comment changes in diff | - |
Auto-skip detection commands:
# Has type definitions?
grep -l "interface \|type \|class \|enum " <diff_files>
# Has comment changes?
git diff HEAD -- <diff_files> | grep "^[+-].*//\|^[+-].*\*\|^[+-].*#"
# Has test files?
echo "<diff_files>" | grep -E "\.(test|spec)\.(ts|tsx|js|jsx)$|test_.*\.py$|.*_test\.go$"
# Has error handling?
grep -l "try\|catch\|\.catch\|throw\|Error(" <diff_files>
Mode: all → Force ALL 6 agents, no auto-skip. Use for pre-merge gates (/ultra-dev).
Mode: quick → review-code only
Mode: security → review-code + review-errors
Mode: tests → review-tests only
Mode: recheck → See Recheck Logic below
Mode: delta → See Delta Logic below
Mode: custom → Parse agent names from arguments
Phase 3: Background Execution (Zero Context Pollution)
Launch ALL selected agents in background mode (run_in_background: true) using multiple Task tool calls in a single message. This prevents agent output from polluting the main conversation context.
Each agent receives this prompt template:
You are running as part of /ultra-review pipeline.
SESSION_PATH: {SESSION_PATH}
OUTPUT_FILE: {agent-name}.json
DIFF_FILES: {comma-separated file list}
DIFF_RANGE: {resolved DIFF_RANGE from Phase 1}
Review the changed files and write your findings as JSON to:
{SESSION_PATH}/{agent-name}.json
Follow the ultra-review-findings-v1 schema exactly. Read the schema from:
~/.claude/skills/ultra-review/references/unified-schema.md
Only report findings with confidence >= 75.
After writing the JSON file, output one line: "Wrote N findings (P0:X P1:X P2:X P3:X) to <filepath>"
Important: Use subagent_type matching the agent name (e.g., review-code, review-tests, etc.).
Important: Set run_in_background: true on every Task call. Do NOT read TaskOutput — this avoids injecting agent results into main context.
Maximum 12 findings per agent. If you find more, report only the top 12 by severity (P0 first), then confidence. Note total count in your output line: "Wrote 12/23 findings..."
CRITICAL PROHIBITION: After launching background agents, you MUST:
- Call
review_wait.pyIMMEDIATELY — do NOT process idle notifications - NEVER call TaskOutput for any review agent — their output goes to JSON files only
- Ignore all agent idle/completion messages between launch and wait script return
- The ONLY information path from agents is: wait script → Read SUMMARY.json
Violation of these rules causes context overflow. The review agents write to files; you read from files.
Phase 4: Wait & Coordinate (File-Based)
Step 4a: Wait for agents — Use Bash to block until all agents finish writing:
python3 ~/.claude/skills/ultra-review/scripts/review_wait.py {SESSION_PATH} agents {AGENT_COUNT}
This polls for review-*.json files every 2 seconds. Timeout: 5 minutes.
Output is structured JSON on stdout:
status: "complete"→ all agents done, proceed normallystatus: "partial"→ timeout, log missing agents, pass onlyagents_doneto coordinator, note partial review in SUMMARY- Exit code 1 (0 agents) → skip review, warn user, allow ultra-dev to continue
Step 4b: Launch coordinator in background:
Launch review-coordinator with run_in_background: true:
You are the review coordinator.
SESSION_PATH: {SESSION_PATH}
AGENTS_RUN: {list of agents from wait output}
Read all review-*.json files in the session directory, deduplicate findings,
compute verdict, and generate SUMMARY.md + SUMMARY.json.
Step 4c: Wait for coordinator:
python3 ~/.claude/skills/ultra-review/scripts/review_wait.py {SESSION_PATH} summary
Output is a single verdict line (~15 tokens).
Phase 5: Report to User
After review_wait.py summary returns:
- Read SUMMARY.json (one Read call, ~500 tokens) to get verdict and counts
- Update index.json — fill in the placeholder entry with actual verdict, p0, p1, total from SUMMARY.json
- Present a concise summary:
## Review Complete
**Verdict**: {APPROVE | COMMENT | REQUEST_CHANGES}
**Findings**: X total (P0: A, P1: B, P2: C, P3: D)
**Agents**: N ran, M succeeded
### Top Findings
1. [P0] {title} - {file}:{line}
2. [P1] {title} - {file}:{line}
3. ...
(show up to 5 most critical findings)
Full report: {SESSION_PATH}/SUMMARY.md
Then ask the user:
- Fix all - Fix all P0 and P1 issues
- Fix P0 only - Fix only critical issues
- View full report - Read SUMMARY.md
- Skip - No fixes needed
Recheck Logic
When mode is recheck:
- Find the most recent session for the current branch in
index.json - Read its
SUMMARY.json - Extract files with P0 or P1 findings
- Run review-code only on those specific files
- Compare with previous findings to show delta (resolved / new / unchanged)
Delta Logic
When mode is delta:
- Find the most recent session for the current branch in
index.json - Read its
SUMMARY.jsonto get the scope of last review - Identify files changed since the last review:
# If last session was commit-based, diff from that commit git diff --name-only <last-reviewed-commit>..HEAD # If last session was working-tree-based, diff current unstaged + staged git diff --name-only HEAD - Exclude files that have NOT changed since last review (already reviewed)
- Run full agent set only on the newly changed files
- In coordinator, merge with previous session findings (carry forward unresolved P0/P1)
Use case: Iterative development — fix issues, add code, only review what's new.
Session Index
A lightweight index file tracks all sessions for cross-referencing:
File: .ultra/reviews/index.json
{
"sessions": [
{
"id": "20260214-103000-feat-task-3-auth-iter1",
"branch": "feat/task-3-auth",
"iteration": 1,
"mode": "full",
"timestamp": "2026-02-14T10:30:00Z",
"verdict": "REQUEST_CHANGES",
"p0": 2,
"p1": 3,
"total": 12,
"parent": null
},
{
"id": "20260214-113000-feat-task-3-auth-iter2",
"branch": "feat/task-3-auth",
"iteration": 2,
"mode": "recheck",
"timestamp": "2026-02-14T11:30:00Z",
"verdict": "COMMENT",
"p0": 0,
"p1": 1,
"total": 4,
"parent": "20260214-103000-feat-task-3-auth-iter1"
}
]
}
Fields:
| Field | Description |
|-------|-------------|
| id | Session directory name |
| branch | Git branch at review time |
| iteration | Nth review for this branch (auto-incremented) |
| mode | Review mode used (full/quick/security/recheck/delta) |
| timestamp | ISO 8601 |
| verdict | APPROVE / COMMENT / REQUEST_CHANGES |
| p0, p1, total | Finding counts from SUMMARY.json |
| parent | Previous session ID for this branch (forms iteration chain) |
Branch-scoped queries:
recheck/delta: filterindex.jsonbybranch == current branch, take latest- Avoids cross-task pollution when multiple branches are active
Lifecycle Management
Cleanup strategy (runs at Phase 1 of every /ultra-review):
| Rule | Action | |------|--------| | Session > 7 days AND verdict = APPROVE | Delete directory + remove from index | | Session > 7 days AND verdict = COMMENT | Delete directory + remove from index | | Session > 7 days AND verdict = REQUEST_CHANGES | Keep (unresolved P0s should not be silently deleted) | | Session > 30 days (any verdict) | Delete directory + remove from index | | Per-branch limit: > 5 sessions | Delete oldest APPROVE/COMMENT sessions for that branch |
Implementation:
# In Phase 1, after creating session directory:
index = read index.json (or create if missing)
# 1. Remove stale sessions
now = current_time()
for session in index.sessions:
age_days = (now - session.timestamp).days
if age_days > 30:
delete(session)
elif age_days > 7 and session.verdict != 'REQUEST_CHANGES':
delete(session)
# 2. Per-branch cap (keep latest 5)
branch_sessions = [s for s in index.sessions if s.branch == current_branch]
if len(branch_sessions) > 5:
# Sort by timestamp, keep latest 5, delete rest (except REQUEST_CHANGES)
to_remove = sorted(branch_sessions, key=timestamp)[:-5]
for s in to_remove:
if s.verdict != 'REQUEST_CHANGES':
delete(s)
# 3. Write updated index
write index.json
Fix Flow
When user selects "Fix all" or "Fix P0 only" after review:
Step 0: Context Reset (if verdict is REQUEST_CHANGES):
- Run
/compactto clear TDD cycle history from context - After compact, read
.ultra/workflow-state.jsonto restore task context - Read
SUMMARY.jsonfresh to get actionable findings - Then proceed with fixes below
- Read SUMMARY.json — extract actionable P0/P1 items, sort by severity (P0 first)
- Group by file — organize findings per file
- For each file with findings (with circuit breaker tracking):
a. Read the file once
b. Apply ALL P0/P1 fixes for that file (batch within same file)
c. Run the RELEVANT test file only:
- Detect:
foo.ts→foo.test.ts/foo.spec.ts - Run:
npx jest <test-file> --bailorpytest <test-file> -xd. If test passes → mark findings as addressed, move to next file e. If test fails from THIS fix → increment per-file failure counter, revert, try alternative approach f. If test fails from pre-existing issue → note it, continue g. Circuit Breaker: If same file has 3 consecutive fix failures (test fails after each attempt): - Display: "Circuit breaker: {filename} — 3 consecutive fix failures. Likely architectural issue, not isolated bug."
- Use AskUserQuestion:
- A) "Skip this file, continue with others"
- B) "Manually inspect this file" (exit auto-fix, user takes over)
- C) "Abandon this fix round" (end fix flow, write UNRESOLVED.md) h. Global Circuit Breaker: If 3+ files trigger per-file circuit breaker:
- Display: "Multiple files failing repeatedly — suggests systematic issue, not isolated bugs."
- Recommend ending fix flow, write all unresolved findings to UNRESOLVED.md with tag
ARCHITECTURAL_CONCERN
- Detect:
- After all files fixed: run FULL test suite once as validation
- Update verdict — after all P0 fixes applied and tests pass:
This recalculates the verdict from current P0/P1 counts and updates both SUMMARY.json and index.json. Without this step, pre_stop_check will block on stale REQUEST_CHANGES verdict.python3 ~/.claude/skills/ultra-review/scripts/review_verdict_update.py {SESSION_PATH} - Optionally:
/ultra-review recheck(counts as iteration, respects max 2 cap)
Important: The main agent (not review agents) performs fixes. Review agents are read-only by design.
Integration with ultra-dev
This skill integrates with the TDD workflow:
- After
tdd-runnerconfirms tests pass - Before
commitor PR creation - Quick mode for iterative development
- Full mode for pre-merge gate
- Recheck mode after fixing review findings
- Delta mode for incremental review during long development sessions
Error Handling
- If no diff detected: inform user, suggest using
--rangeor--prto specify scope - If
--prfails (no gh CLI or no PR): fall back to default diff and warn user - If an agent fails to write JSON: note in coordinator input, skip that agent's findings
- If coordinator fails: fall back to reading individual JSON files and presenting raw findings
- If no agents selected (all skipped): inform user with skip reasons
Directory Structure
.ultra/reviews/
├── index.json # Session index (all sessions metadata)
├── 20260214-103000-feat-task-3-auth-iter1/ # First review
│ ├── review-code.json
│ ├── review-tests.json
│ ├── review-errors.json
│ ├── review-design.json
│ ├── review-comments.json
│ ├── SUMMARY.json
│ └── SUMMARY.md
└── 20260214-113000-feat-task-3-auth-iter2/ # Recheck after fixes
├── review-code.json
├── SUMMARY.json
└── SUMMARY.md