MANDATORY RULES: VIOLATION IS FORBIDDEN
- Response language follows the language setting in .agents/oma-config.yaml if configured.
- NEVER skip steps. Execute from Step 0 in order. Explicitly report completion of each step to the user before proceeding to the next.
- You MUST use MCP tools throughout the entire workflow. This is NOT optional.
  - Use code analysis tools (get_symbols_overview, find_symbol, find_referencing_symbols, search_for_pattern) for code exploration.
  - Use memory tools (read/write/edit) for progress tracking.
    - Memory path: configurable via memoryConfig.basePath (default: .serena/memories)
    - Tool names: configurable via memoryConfig.tools in .agents/mcp.json
  - Do NOT use raw file reads or grep as substitutes. MCP tools are the primary interface for code and memory operations.
- Read the oma-coordination skill BEFORE starting. Read .agents/skills/oma-coordination/SKILL.md and follow its Core Rules.
- Follow the context-loading guide. Read .agents/skills/_shared/core/context-loading.md and load only task-relevant resources.
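The memory path rule above can be sketched as follows. This is a minimal illustration, not the workflow's actual loader: the MemoryConfig interface and the memoryFilePath helper are hypothetical names, and only the basePath key and its .serena/memories default come from the rules.

```typescript
// Hypothetical shape: only `basePath` and its default are taken from the rules above.
interface MemoryConfig {
  basePath?: string;
}

const DEFAULT_MEMORY_BASE_PATH = ".serena/memories";

// Returns the location of a memory file under the configured (or default) base path.
// Trailing slashes on the base path are dropped so the result has a single separator.
function memoryFilePath(config: MemoryConfig | undefined, fileName: string): string {
  const base = (config?.basePath || DEFAULT_MEMORY_BASE_PATH).replace(/\/+$/, "");
  return `${base}/${fileName}`;
}
```

With no configuration, memoryFilePath(undefined, "session-ultrawork.md") resolves under the default base path.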
Vendor Detection
Before starting, determine your runtime environment by following .agents/skills/_shared/core/vendor-detection.md.
The detected runtime vendor and each agent's target vendor determine how agents are spawned in Phase 2 (IMPL), Phase 3 (VERIFY), Phase 4 (REFINE), and Phase 5 (SHIP).
Phase 0: Initialization (DO NOT SKIP)
- Read .agents/skills/oma-coordination/SKILL.md and confirm Core Rules.
- Read .agents/skills/_shared/core/context-loading.md for resource loading strategy.
- Read .agents/skills/_shared/runtime/memory-protocol.md for memory protocol.
- Read .agents/skills/_shared/runtime/event-spec.md for L1 event protocol.
- Use the oma_emit helper documented in .agents/skills/_shared/runtime/event-spec.md for required L1 decisions. The helper wraps oma state:emit.
- Read .agents/workflows/ultrawork/resources/multi-review-protocol.md (11 review guides)
- Read .agents/skills/_shared/core/quality-principles.md (4 principles)
- Read .agents/workflows/ultrawork/resources/phase-gates.md (gate definitions)
- Record session start using memory write tool:
  - Create session-ultrawork.md in the memory base path
  - Include: session start time, user request summary, workflow version (ultrawork)
Phase 1: PLAN (Steps 1-4)
Step 1: Create Plan & Review
// turbo
Activate PM Agent to execute Steps 1-4:
- Analyze requirements.
- Define API contracts.
- Create a prioritized task breakdown.
- Execute Plan Review - Completeness (Step 2).
- Execute Meta Review (Step 3).
- Execute Over-Engineering Review (Step 4).
- Save plan to .agents/results/plan-{sessionId}.json.
- Create task-board.md in memory path for dashboard compatibility.
- Use memory write tool to record plan completion.
Step 2: Plan Review (Completeness)
- Executed by PM Agent: Ensure requirements are fully mapped.
Step 3: Review Verification (Meta Review)
- Executed by PM Agent: Self-verify if the review was sufficient.
Step 4: Over-Engineering Review (Simplicity)
- Executed by PM Agent: Check for unnecessary complexity (MVP focus).
PLAN_GATE
- [ ] Plan documented
- [ ] Assumptions listed
- [ ] Alternatives considered
- [ ] Over-engineering review done
- [ ] User confirmation
On gate pass:
- Use memory edit tool to record phase completion in session-ultrawork.md.
- Emit the required L1 decision:
  oma_emit "decision.made" '{"subject":"ultrawork.plan-approved","decision":"Proceed with the approved PLAN output.","rationale":"PLAN_GATE passed and the user confirmed scope."}'
- Verify the required decision before Phase 2:
  oma state:verify --workflow ultrawork --checkpoint plan-approved
- Emit and verify the implementation scope lock before spawning implementation agents:
  oma_emit "decision.made" '{"subject":"ultrawork.impl-plan-locked","decision":"Use the approved task decomposition for IMPL.","rationale":"PLAN output is locked before implementation agents are spawned."}'
  oma state:verify --workflow ultrawork --checkpoint impl-plan-locked
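The decision payloads passed to oma_emit are plain JSON objects with subject, decision, and rationale keys. A small sketch of building such a command line safely is shown here; the Decision interface and the helper names are hypothetical, and the sketch only assembles the string, it does not invoke oma_emit.

```typescript
// Hypothetical helper types; the three keys mirror the payloads shown above.
interface Decision {
  subject: string;
  decision: string;
  rationale: string;
}

// Serialize with a fixed key order so the payload matches the documented form.
function decisionPayload(d: Decision): string {
  return JSON.stringify({ subject: d.subject, decision: d.decision, rationale: d.rationale });
}

// Wrap a string in single quotes for a POSIX shell, escaping embedded single quotes.
function shellSingleQuote(s: string): string {
  return `'${s.replace(/'/g, `'\\''`)}'`;
}

// Build the command text only; running it is left to the caller.
function emitCommand(d: Decision): string {
  return `oma_emit "decision.made" ${shellSingleQuote(decisionPayload(d))}`;
}
```

Generating the payload through JSON.stringify avoids hand-escaping quotes when a rationale contains punctuation.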
Gate failure → Return to Step 1
Phase 2: IMPL (Step 5)
Step 5: Implementation
// turbo
Spawn Implementation Agents (Backend/Frontend/Mobile) in parallel.
Per-Agent Dispatch
Resolve the target vendor for each agent from .agents/oma-config.yaml.
Use native subagents only when target_vendor === current_runtime_vendor and that runtime supports the vendor's role-subagent path.
Otherwise use oma agent:spawn for that agent.
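The dispatch rule above reduces to a single condition. The sketch below assumes the caller has already resolved both vendors and knows whether the runtime supports the role-subagent path; the type and function names are illustrative only.

```typescript
// Illustrative names; vendors are compared as plain strings such as "claude" or "codex".
type Dispatch = "native-subagent" | "oma agent:spawn";

// Native subagents are used only when the target vendor matches the runtime vendor
// AND the runtime supports that vendor's role-subagent path. Everything else falls back.
function chooseDispatch(
  targetVendor: string,
  runtimeVendor: string,
  nativeSupported: boolean,
): Dispatch {
  return targetVendor === runtimeVendor && nativeSupported
    ? "native-subagent"
    : "oma agent:spawn";
}
```

The decision is made per agent, so one session can mix native subagents and oma agent:spawn processes.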
If Claude Code and target vendor is Claude
Use the Agent tool to spawn subagents:
Agent(subagent_type="backend-engineer", prompt="Implement backend tasks per plan. IMPORTANT: Follow .agents/skills/_shared/core/context-loading.md rules.", run_in_background=true)
Agent(subagent_type="frontend-engineer", prompt="Implement frontend tasks per plan. IMPORTANT: Follow .agents/skills/_shared/core/context-loading.md rules.", run_in_background=true)
- Multiple Agent tool calls in the same message = true parallel execution
If Codex CLI and target vendor is Codex
Spawn native Codex custom agents using .codex/agents/{agent}.toml when available.
Pass each agent its task description, API contracts, and relevant context.
If native dispatch is not verified in the current runtime, fall back to oma agent:spawn.
If Gemini CLI and target vendor is Gemini
Use native Gemini subagents when available, otherwise fall back to oma agent:spawn.
If target vendor differs from current runtime, or native dispatch is unavailable
oma agent:spawn backend "Implement backend tasks per plan. IMPORTANT: Follow .agents/skills/_shared/core/context-loading.md rules." session-id -w ./backend &
oma agent:spawn frontend "Implement frontend tasks per plan. IMPORTANT: Follow .agents/skills/_shared/core/context-loading.md rules." session-id -w ./frontend &
wait
Step 5.1: Monitor & Wait for Completion
Wait for all implementation agents to complete before proceeding.
- Use memory read tool to poll progress-{agent}[-{sessionId}].md files
- Use MCP code analysis tools to verify implementation alignment
- Check for result-{agent}[-{sessionId}].md files to confirm completion
- Use memory edit tool to record monitoring results in session-ultrawork.md
Continue polling until all agents report completion or failure.
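The file-name pattern and the completion check can be sketched as below. This assumes the set of existing memory file names has already been obtained through the memory tools; memoryFileName and pendingAgents are hypothetical helpers, not part of the workflow.

```typescript
// Builds progress-{agent}[-{sessionId}].md or result-{agent}[-{sessionId}].md.
// The session suffix is included only when a session id is supplied.
function memoryFileName(
  kind: "progress" | "result",
  agent: string,
  sessionId?: string,
): string {
  return sessionId ? `${kind}-${agent}-${sessionId}.md` : `${kind}-${agent}.md`;
}

// Returns the agents that have not yet written a result file.
// `existingFiles` is whatever listing the memory tools returned (an assumption here).
function pendingAgents(
  agents: string[],
  existingFiles: Set<string>,
  sessionId?: string,
): string[] {
  return agents.filter(
    (agent) => !existingFiles.has(memoryFileName("result", agent, sessionId)),
  );
}
```

Polling can stop once pendingAgents returns an empty list; a result file marks that an agent has finished, not that it succeeded.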
Step 5.2: Measure Baseline Quality Score (Conditional)
If automated measurement is available (tests, lint exist):
- Load quality-score.md (conditional, per context-loading.md)
- Run tests, lint, type-check via Bash to measure baseline
- Create Experiment Ledger via memory tools: [WRITE]("experiment-ledger.md", initial ledger with baseline row)
- Record composite score as the IMPL baseline
If no measurement tools: skip; gates fall back to binary checklist.
IMPL_GATE
- [ ] Build succeeds
- [ ] Tests pass
- [ ] Only planned files modified
- [ ] (If measured) Baseline Quality Score recorded in Experiment Ledger
On gate pass: Use memory edit tool to record phase completion in session-ultrawork.md
Gate failure → Return to Step 5, re-spawn failed agents, and repeat monitoring until GATE passes.
Phase 3: VERIFY (Steps 6-8)
Steps 6-8: QA Verification
// turbo
Spawn QA Agent to execute Steps 6-8.
If Claude Code
Use the Agent tool to spawn subagent:
Agent(subagent_type="qa-reviewer", prompt="Execute Phase 3 Verification. Step 6: Alignment Review. Step 7: Security/Bug Review (npm audit, OWASP). Step 8: Improvement/Regression Review. IMPORTANT: Follow .agents/skills/_shared/core/context-loading.md rules.", run_in_background=true)
If Codex CLI
Spawn native Codex custom agents using .codex/agents/{agent}.toml when available for QA verification.
If native dispatch is not verified in the current runtime, fall back to oma agent:spawn.
If Gemini CLI or Antigravity or CLI Fallback
oma agent:spawn qa-agent "Execute Phase 3 Verification. Step 6: Alignment Review. Step 7: Security/Bug Review (npm audit, OWASP). Step 8: Improvement/Regression Review. IMPORTANT: Follow .agents/skills/_shared/core/context-loading.md rules." session-id
Monitor QA Agent Progress
Wait for QA Agent to complete verification before proceeding.
- Use memory read tool to poll progress-qa-agent[-{sessionId}].md
- Check for result-qa-agent[-{sessionId}].md to confirm completion
  - Claude-native path: the Agent tool returns synchronously and the qa-reviewer subagent writes result-qa[-{sessionId}].md under .agents/results/, so check that file instead of polling.
- Use memory edit tool to record QA results in session-ultrawork.md
Continue polling until QA Agent reports completion.
Step 6: Alignment Review
- Executed by QA Agent: Compare implementation vs plan.
Step 7: Security/Bug Review (Safety)
- Executed by QA Agent: Check for vulnerabilities (Safety).
Step 8: Improvement Review (Regression Prevention)
- Executed by QA Agent: Run regression tests.
Step 8.1: Measure Post-VERIFY Quality Score (Conditional)
If baseline was measured at Step 5.2:
- Measure Quality Score incorporating QA findings
- Calculate delta from IMPL baseline
- Record as experiment in Experiment Ledger via memory tools
VERIFY_GATE
- [ ] Implementation = Requirements
- [ ] CRITICAL count: 0
- [ ] HIGH count: 0
- [ ] No regressions
- [ ] (If measured) Quality Score >= 75 (Grade B)
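The VERIFY_GATE checklist can be read as a predicate over the QA results. The sketch below is one possible encoding; the VerifyInput field names are hypothetical, and the Quality Score check applies only when a score was measured.

```typescript
// Hypothetical input shape summarizing the QA Agent's findings.
interface VerifyInput {
  matchesRequirements: boolean;
  criticalCount: number;
  highCount: number;
  regressionCount: number;
  qualityScore?: number; // undefined when no baseline was measured
}

// Returns the list of failed checklist items; an empty list means the gate passes.
function verifyGateFailures(v: VerifyInput): string[] {
  const failures: string[] = [];
  if (!v.matchesRequirements) failures.push("Implementation != Requirements");
  if (v.criticalCount !== 0) failures.push(`CRITICAL count: ${v.criticalCount}`);
  if (v.highCount !== 0) failures.push(`HIGH count: ${v.highCount}`);
  if (v.regressionCount !== 0) failures.push(`Regressions: ${v.regressionCount}`);
  if (v.qualityScore !== undefined && v.qualityScore < 75) {
    failures.push(`Quality Score ${v.qualityScore} < 75 (Grade B)`);
  }
  return failures;
}
```

Returning the failed items rather than a boolean gives the re-spawned agents specific issues to address.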
On gate pass: Use memory edit tool to record phase completion in session-ultrawork.md
Gate failure (1st time) → Before re-spawning for the next VERIFY cycle, check the session cost cap:
Review Loop termination conditions (OR, whichever fires first wins):
- Gate failure count has reached the configured maximum iterations (default: 5 total VERIFY + REFINE cycles). Do not start another cycle.
- Session cost cap exceeded: if loadQuotaCap() from cli/io/session-cost.ts returns non-null, call checkCap(sessionId, cap) (no cap configured → skip this condition). If exceeded === true, print formatPromptMessage(result) to the user and stop the loop immediately. Save all current step results before stopping, then report to the user that the loop was terminated early due to quota.
If neither condition is met, return to Step 5 and continue.
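The two termination conditions combine with OR, checked in order. The sketch below does not call the real helpers in cli/io/session-cost.ts; it assumes the caller passes the cap check result, or null when no cap is configured. The CapCheck shape (beyond the exceeded field named above) and the function name are hypothetical.

```typescript
// Only `exceeded` is named in the text; the rest of the real result shape is unknown.
interface CapCheck {
  exceeded: boolean;
}

type LoopDecision = "continue" | "stop:max-iterations" | "stop:cost-cap";

// failureCount: gate failures so far across VERIFY + REFINE cycles.
// capCheck: null when no cap is configured, so the cost condition is skipped.
function reviewLoopDecision(
  failureCount: number,
  capCheck: CapCheck | null,
  maxIterations: number = 5,
): LoopDecision {
  if (failureCount >= maxIterations) return "stop:max-iterations";
  if (capCheck !== null && capCheck.exceeded) return "stop:cost-cap";
  return "continue";
}
```

On either stop result, save the current step results before reporting to the user.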
Root-cause-first fix mandate: when re-spawning implementation agents to address QA findings, the fix prompt MUST require root-cause remediation. Forbid tactical patches (try/catch swallowing the error, validation bypass, hardcoded values, feature flags hiding the bug, silencing the failing test) unless the agent explicitly justifies why a structural fix is out of scope (upstream library bug, deprecated path, hotfix window).
Gate failure (2nd time on same issue, and termination conditions not yet met) → Activate Exploration Loop:
- Load exploration-loop.md (conditional, per context-loading.md)
- Generate 2-3 alternative hypotheses using Exploration Decision template (reasoning-templates.md#6)
- Try each approach sequentially (git stash per attempt)
- Measure Quality Score for each
- Select the highest-scoring approach
- Record all experiments in Experiment Ledger
- Resume VERIFY with winning approach
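Selecting the winning approach from the measured attempts might look like this. The Attempt shape is hypothetical, and breaking ties in favor of the earlier attempt is an assumption of this sketch, not something the workflow specifies.

```typescript
// Hypothetical record of one exploration attempt and its measured Quality Score.
interface Attempt {
  hypothesis: string;
  score: number;
}

// Returns the highest-scoring attempt, or undefined when nothing was measured.
// On equal scores the earlier attempt is kept (assumption).
function selectWinner(attempts: Attempt[]): Attempt | undefined {
  return attempts.reduce<Attempt | undefined>(
    (best, current) =>
      best === undefined || current.score > best.score ? current : best,
    undefined,
  );
}
```

All attempts, including the losing ones, still belong in the Experiment Ledger.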
Phase 4: REFINE (Steps 9-13)
Steps 9-13: Deep Refinement
// turbo
Spawn Debug Agent (or Senior Dev Agent) to execute Steps 9-13.
If Claude Code
Use the Agent tool to spawn subagent:
Agent(subagent_type="debug-investigator", prompt="Execute Phase 4 Refine. Step 9: Split large files. Step 10: Integration check. Step 11: Side Effect analysis (find_referencing_symbols). Step 12: Consistency review. Step 13: Cleanup dead code. IMPORTANT: Follow .agents/skills/_shared/core/context-loading.md rules.", run_in_background=true)
If Codex CLI
Spawn native Codex custom agents using .codex/agents/{agent}.toml when available for refinement tasks.
If native dispatch is not verified in the current runtime, fall back to oma agent:spawn.
If Gemini CLI or Antigravity or CLI Fallback
oma agent:spawn debug-agent "Execute Phase 4 Refine. Step 9: Split large files. Step 10: Integration check. Step 11: Side Effect analysis (find_referencing_symbols). Step 12: Consistency review. Step 13: Cleanup dead code. IMPORTANT: Follow .agents/skills/_shared/core/context-loading.md rules." session-id
Monitor Debug Agent Progress
Wait for Debug Agent to complete refinement before proceeding.
- Use memory read tool to poll progress-debug-agent[-{sessionId}].md
- Check for result-debug-agent[-{sessionId}].md to confirm completion
  - Claude-native path: the Agent tool returns synchronously and the debug-investigator subagent writes result-debug[-{sessionId}].md under .agents/results/, so check that file instead of polling.
- Use memory edit tool to record refinement results in session-ultrawork.md
Continue polling until Debug Agent reports completion.
Step 9: Split Large Files/Functions
- Executed by Debug Agent: Files > 500 lines, Functions > 50 lines.
Step 10: Integration/Reuse Review (Reusability)
- Executed by Debug Agent: Check for duplicate logic.
Step 11: Side Effect Review (Cascade Impact)
- Executed by Debug Agent: Analyze impact scope.
Step 12: Full Change Review (Consistency)
- Executed by Debug Agent: Review naming and style.
Step 13: Clean Up Unused Code
- Executed by Debug Agent: Remove newly created dead code.
Step 13.1: Measure Post-REFINE Quality Score (Conditional)
If baseline was measured at Step 5.2:
- Measure Quality Score after refinement
- Calculate delta from Post-VERIFY score
- If delta < -5: Apply Discard rule. Revert refinement changes, record in Experiment Ledger.
- Record kept experiments in Experiment Ledger
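The delta calculation and Discard rule from Step 13.1 can be sketched as a small function. The names are illustrative; the -5 threshold comes from the step above.

```typescript
// Computes the change from the Post-VERIFY score and applies the Discard rule:
// a drop of more than 5 points means the refinement changes are reverted.
function refineOutcome(
  postVerifyScore: number,
  postRefineScore: number,
): { delta: number; action: "keep" | "discard" } {
  const delta = postRefineScore - postVerifyScore;
  return { delta, action: delta < -5 ? "discard" : "keep" };
}
```

A kept experiment is not automatically a passing one: a small negative delta is kept under this rule but still fails the REFINE_GATE requirement that the score not fall below the Post-VERIFY score.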
REFINE_GATE
- [ ] No large files/functions
- [ ] Integration opportunities captured
- [ ] Side effects verified
- [ ] Code cleaned
- [ ] (If measured) Quality Score >= Post-VERIFY score (no regression from refinement)
On gate pass:
- Use memory edit tool to record phase completion in session-ultrawork.md.
- Emit and verify the REFINE outcome decision:
  oma_emit "decision.made" '{"subject":"ultrawork.refine-outcome","decision":"Keep the REFINE changes or explicitly skip refinement.","rationale":"REFINE_GATE passed or the documented skip condition applies."}'
  oma state:verify --workflow ultrawork --checkpoint refine-outcome
Gate failure → Before re-spawning the Debug Agent, apply the same termination check:
Review Loop termination conditions (OR, whichever fires first wins):
- Total REFINE failure count has reached the configured maximum iterations (default: 5 cycles across all phases). Do not start another cycle.
- Session cost cap exceeded: if loadQuotaCap() from cli/io/session-cost.ts returns non-null, call checkCap(sessionId, cap) (no cap configured → skip this condition). If exceeded === true, print formatPromptMessage(result) to the user and stop. Save current step results before stopping, then report early termination due to quota.
If neither condition is met, re-spawn the Debug Agent with specific issues and repeat until GATE passes.
Skip conditions: Simple tasks < 50 lines
Phase 5: SHIP (Steps 14-17)
Steps 14-17: Final QA & Deployment Readiness
// turbo
Spawn QA Agent to execute Steps 14-17.
If Claude Code
Use the Agent tool to spawn subagent:
Agent(subagent_type="qa-reviewer", prompt="Execute Phase 5 Ship. Step 14: Quality Review (lint/coverage). Step 15: UX Flow Verification. Step 16: Related Issues Review. Step 17: Deployment Readiness. IMPORTANT: Follow .agents/skills/_shared/core/context-loading.md rules.", run_in_background=true)
If Codex CLI
Spawn native Codex custom agents using .codex/agents/{agent}.toml when available for final QA and deployment readiness tasks.
If native dispatch is not verified in the current runtime, fall back to oma agent:spawn.
If Gemini CLI or Antigravity or CLI Fallback
oma agent:spawn qa-agent "Execute Phase 5 Ship. Step 14: Quality Review (lint/coverage). Step 15: UX Flow Verification. Step 16: Related Issues Review. Step 17: Deployment Readiness. IMPORTANT: Follow .agents/skills/_shared/core/context-loading.md rules." session-id
Monitor Final QA Progress
Wait for QA Agent to complete final review before proceeding.
- Use memory read tool to poll progress-qa-agent[-{sessionId}].md
- Check for result-qa-agent[-{sessionId}].md to confirm completion
  - Claude-native path: the Agent tool returns synchronously and the qa-reviewer subagent writes result-qa[-{sessionId}].md under .agents/results/, so check that file instead of polling.
- Use memory edit tool to record final QA results in session-ultrawork.md
Continue polling until QA Agent reports completion.
Step 14: Code Quality Review
- Executed by QA Agent: Lint, Types, Coverage.
Step 15: UX Flow Verification
- Executed by QA Agent: User journey check.
Step 16: Related Issues Review (Cascade Impact 2nd)
- Executed by QA Agent: Final impact check.
Step 17: Deployment Readiness Review (Final)
- Executed by QA Agent: Secrets, Migrations, checklist.
Step 17.1: Final Quality Score & Session Summary (Conditional)
If Quality Score was measured during this session:
- Measure final Quality Score
- Generate Experiment Ledger summary (total experiments, keep rate, net delta)
- Auto-generate lessons from discarded experiments (delta <= -5) into lessons-learned.md
- Append Quality Score Progression and Experiment Summary to session metrics
Always (regardless of Quality Score availability):
- Record Evaluator Accuracy events for this session:
  - Review all QA findings: any disputed by impl agents? → false_positive
  - Review runtime verification results: any stubs caught that static review missed? → missed_stub
  - Review impl agent self-check results: any bugs caught by QA that self-check missed? → good_catch
- Append EA events to session-metrics.md
- If rolling 3-session EA >= 30: Flag in final report → "QA tuning suggested. Run oma retro to review."
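The Experiment Ledger summary in Step 17.1 (total experiments, keep rate, net delta) could be computed as below. The LedgerRow shape is hypothetical, and defining net delta as the sum of deltas over kept experiments is an assumption of this sketch; the ledger format itself is defined elsewhere.

```typescript
// Hypothetical ledger row: the score change of one experiment and whether it was kept.
interface LedgerRow {
  delta: number;
  kept: boolean;
}

interface LedgerSummary {
  total: number;
  keepRate: number; // fraction between 0 and 1
  netDelta: number; // sum of deltas over kept experiments (assumption)
}

function summarizeLedger(rows: LedgerRow[]): LedgerSummary {
  const kept = rows.filter((row) => row.kept);
  const total = rows.length;
  return {
    total,
    keepRate: total === 0 ? 0 : kept.length / total,
    netDelta: kept.reduce((sum, row) => sum + row.delta, 0),
  };
}
```

Discarded rows are excluded from the net delta here because their changes were reverted; they remain the input for lessons-learned.md.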
SHIP_GATE
- [ ] Quality checks pass
- [ ] UX verified
- [ ] Related issues resolved
- [ ] Deployment checklist complete
- [ ] (If measured) Final Quality Score >= 75 (Grade B) with non-negative delta from baseline
- [ ] (If measured) Experiment Ledger summary recorded
- [ ] User final approval
On gate pass: Use memory write tool to record final results in session-ultrawork.md
Gate failure → Address issues, re-run affected steps, and repeat until GATE passes.
Step 18: Optional Doc Verify Hook (post-SHIP; outside the 17-step model)
If oma-config.yaml has docs.auto_verify: true:
- Run oma docs verify --json from the repo root.
- Capture the JSON output.
- If broken.length === 0: print a "docs verified clean (N docs)" summary to stdout and continue with workflow completion.
- If broken.length > 0: print a 1-3 line summary identifying which docs have drift, and a hint "Run /oma-docs verify for the full report." Continue with workflow completion (warn-only, never block).
- If oma-docs is not available (CLI command missing): skip silently.
This hook is opt-in; the default auto_verify: false skips this step entirely.
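The warn-only handling of the verify output can be sketched as a pure function over the captured JSON. Only the broken array is named in the steps above; the path field on each entry and the checked count are hypothetical stand-ins for whatever the real report contains.

```typescript
// Hypothetical report shape: `broken` is from the steps above, the other fields are assumed.
interface DocsVerifyReport {
  broken: { path: string }[];
  checked: number;
}

// Returns the text to print. It never throws on drift, because the hook is warn-only.
function docVerifyMessage(jsonText: string): string {
  const report = JSON.parse(jsonText) as DocsVerifyReport;
  if (report.broken.length === 0) {
    return `docs verified clean (${report.checked} docs)`;
  }
  // List at most three drifting docs to keep the summary short.
  const names = report.broken.slice(0, 3).map((entry) => entry.path).join(", ");
  return `docs drift in ${report.broken.length} doc(s): ${names}\nRun /oma-docs verify for the full report.`;
}
```

The caller continues with workflow completion regardless of which message is printed.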
Review Steps Summary
| Phase | Steps | Agent | Execution | Perspective |
|-------|-------|-------|-----------|-------------|
| PLAN | 1-4 | PM Agent | Inline | Completeness, Meta, Simplicity |
| IMPL | 5 | Dev Agents | Spawn | Implementation |
| VERIFY | 6-8 | QA Agent | Spawn | Alignment, Safety, Regression |
| REFINE | 9-13 | Debug Agent | Spawn | Reusability, Cascade, Consistency |
| SHIP | 14-17 | QA Agent | Spawn | Quality, UX, Cascade 2nd, Deploy |
Total 11 review steps + conditional Quality Score checkpoints → High quality guaranteed
Autoresearch-Inspired Enhancements
This workflow conditionally incorporates patterns from autoresearch:
| Pattern | When Active | Reference |
|---------|-------------|-----------|
| Continuous metrics | When measurement tools available | quality-score.md (loaded at VERIFY/SHIP) |
| Keep/Discard | When quality score is measured | quality-score.md delta rules |
| Experiment logging | When baseline is established | experiment-ledger.md (via memory protocol) |
| Hypothesis exploration | On repeated gate failures | exploration-loop.md (loaded on trigger) |
| Auto-learning | At session end, if experiments exist | lessons-learned.md auto-generation |
All protocols are loaded conditionally per context-loading.md, not at Phase 0.