Announce: "Using dev-design (Phase 4) to propose implementation approaches and obtain user approval."
Contents
Architecture Design with User Gate
Propose implementation approaches, explain trade-offs, get user approval. Prerequisites: SPEC.md finalized, exploration complete, clarifications resolved.
<EXTREMELY-IMPORTANT> ## The Iron Law of DesignYOU MUST GET USER APPROVAL BEFORE IMPLEMENTATION. This is not negotiable.
After presenting approaches:
- Show 2-3 options with trade-offs
- Lead with your recommendation
- Ask user which approach
- Wait for explicit approval
Implementation CANNOT start without user saying "Yes" or choosing an approach.
STOP - you're about to implement without user approval. </EXTREMELY-IMPORTANT>
What Design Does
| DO | DON'T | |----|-------| | Propose 2-3 approaches | Implement anything | | Explain trade-offs clearly | Make the choice for user | | Lead with recommendation | Present without opinion | | Get explicit approval | Assume approval | | Write PLAN.md | Skip the user gate |
Design answers: HOW to build it and WHY this approach Implement executes: the approved approach (next phase, after gate)
Process
1. Review Inputs
Before designing, ensure the following exist:
.claude/SPEC.md- final requirements- Exploration findings - key files, patterns
- Clarified decisions - edge cases, integrations
2. Propose 2-3 Approaches
Each approach should address the same requirements differently:
Approach A: Minimal Changes
- Smallest diff, maximum reuse
- Trade-off: May be less clean, tech debt
Approach B: Clean Architecture
- Best patterns, maintainability
- Trade-off: More changes, longer implementation
Approach C: Pragmatic Balance
- Balance of speed and quality
- Trade-off: Compromise on both
3. Present with Trade-offs
Use the AskUserQuestion tool to present approaches:
# AskUserQuestion: Present 2-3 architecture approaches with trade-offs for user selection
AskUserQuestion(questions=[{
"question": "Which architecture approach should we use?",
"header": "Architecture",
"options": [
{
"label": "Pragmatic Balance (Recommended)",
"description": "Extend existing AuthService with new method. ~150 lines changed. Balances reuse with clean separation."
},
{
"label": "Minimal Changes",
"description": "Add logic to existing endpoint. ~50 lines changed. Fast but increases coupling."
},
{
"label": "Clean Architecture",
"description": "New service with full abstraction. ~300 lines. Most maintainable but longest to build."
}
],
"multiSelect": false
}])
Key principles:
- Lead with recommendation (first option + "Recommended")
- Concrete numbers (lines changed, files affected)
- Clear trade-offs for each
- Reference specific files from exploration
4. Feature Decomposition Check
CRITICAL: Before writing PLAN.md, check if this is actually multiple features.
Review the scope and ask:
# AskUserQuestion: Determine if feature should be split into independent tasks
AskUserQuestion(questions=[{
"question": "Is this one cohesive feature or multiple independent features?",
"header": "Scope",
"options": [
{
"label": "One feature",
"description": "Implement everything together in one branch/worktree"
},
{
"label": "Multiple features",
"description": "Break into separate features, each with own branch/worktree/PR"
}
],
"multiSelect": false
}])
If "Multiple features":
-
List the independent features identified from SPEC.md:
Based on the requirements, this breaks into: 1. Theme infrastructure (color system, theme provider) 2. Settings UI (theme selector component) 3. Component updates (update 20+ components to use theme) 4. Persistence layer (save user preference) Each can be implemented and PR'd independently. -
Ask which to tackle first:
# AskUserQuestion: Prioritize which feature to implement first AskUserQuestion(questions=[{ "question": "Which feature should we implement first?", "header": "Priority", "options": [ {"label": "Theme infrastructure (Recommended)", "description": "Foundation that others depend on"}, {"label": "Settings UI", "description": "UI for theme selection"}, {"label": "Component updates", "description": "Apply themes to components"}, {"label": "Persistence layer", "description": "Save user preference"} ], "multiSelect": false }]) -
Write PLAN.md for ONLY the chosen feature
-
Document remaining features in
.claude/BACKLOG.md:# Feature Backlog ## Dark Mode Implementation ### Completed - [ ] None yet ### Next Up - [ ] Theme infrastructure - [ ] Settings UI - [ ] Component updates - [ ] Persistence layer **Current Focus:** Theme infrastructure
If "One feature":
Proceed to write PLAN.md for the entire scope (step 5 below).
Why this matters:
- Multiple features in one branch = massive PR, review hell, merge conflicts
- Separate features = clean PRs, incremental progress, easier reviews
- After first feature PR merges, come back and tackle next feature
5. Write PLAN.md
After user chooses approach AND confirms scope, write .claude/PLAN.md:
# Implementation Plan: [Feature]
> **For Claude:** REQUIRED SUB-SKILL: Invoke `Read("${CLAUDE_PLUGIN_ROOT}/lib/skills/dev-implement/SKILL.md")` to implement this plan.
>
> **Per-Task Ralph Loops:** Assign each task its OWN ralph loop. Do NOT combine multiple tasks into one loop.
>
> **Delegation:** Main chat orchestrates, Task agents implement. Use `Read("${CLAUDE_PLUGIN_ROOT}/lib/skills/dev-delegate/SKILL.md")` for subagent templates.
## Chosen Approach
[Name]: [Brief description]
## Rationale
- [Why this approach fits]
- [Trade-offs accepted]
## Testing Strategy (MANDATORY - GATE)
> **⚠️ THIS SECTION MUST BE COMPLETE BEFORE IMPLEMENTATION.**
> **If any field is empty, implementation CANNOT proceed.**
| Field | Value | Status |
|-------|-------|--------|
| **Framework** | [pytest / jest / playwright / etc.] | [ ] Filled |
| **Test Command** | [e.g., `pytest tests/ -v`] | [ ] Filled |
| **First Failing Test** | [Description of what test will fail first] | [ ] Filled |
| **Test File Location** | [e.g., `tests/test_feature.py`] | [ ] Filled |
| **Testing Skill** | [dev-test-electron / dev-test-playwright / etc.] | [ ] Filled |
## REAL Test Criteria (MANDATORY - PREVENTS FAKE TESTS)
> **⚠️ A test that doesn't replicate user workflow is a FAKE test.**
> **If this section is empty or incorrect, you WILL write fake tests.**
| Criteria | Value | Verified |
|----------|-------|----------|
| **User workflow to replicate** | [e.g., "highlight → click panel → see status"] | [ ] |
| **Protocol/transport** | [e.g., "WebSocket" - must match production] | [ ] |
| **UI elements to interact with** | [e.g., "Claude terminal panel"] | [ ] |
| **What user sees/verifies** | [e.g., "'⧉ X lines selected' in status"] | [ ] |
| **Code path exercised** | [e.g., "selection → WebSocket → panel update"] | [ ] |
### Fake Test Prevention Checklist
Before implementation, verify:
[ ] Test uses SAME protocol as production (not a different one) [ ] Test follows user's EXACT workflow (not a shortcut) [ ] Test interacts with ACTUAL UI elements (not direct function calls) [ ] Test verifies what USER sees (not internal state) [ ] Test uses the SPECIFIED testing skill (not your own approach)
**If ANY box is unchecked → You WILL write fake tests. Fix now.**
### Examples of REAL vs FAKE Tests
| Feature Type | FAKE Test (DON'T) | REAL Test (DO) |
|--------------|-------------------|----------------|
| WebSocket feature | HTTP endpoint test | WebSocket connection test |
| GraphQL mutation | REST mock | GraphQL client call |
| UI button click | `button.onClick()` direct call | Playwright/ydotool click |
| CLI command | Call internal function | Invoke actual binary |
| API endpoint | Unit test handler | HTTP request to server |
| Async operation | Sync function call | Await actual promise |
| Database query | Mock repository | Test against real/test DB |
| Electron app | Mock IPC layer | CDP automation |
### The Iron Law of This Plan
**NO TASK STARTS UNTIL ITS TEST IS WRITTEN.**
For each task below:
1. Write the test FIRST (RED)
2. Run the test, see it FAIL
3. Implement the code (GREEN)
4. Refactor if needed
**If you skip the test, DELETE your implementation and start over.**
### What Counts as a REAL Test
| ✅ REAL (execute + verify) | ❌ NOT A TEST (never do this) |
|----------------------------|-------------------------------|
| pytest calls function | grep for function exists |
| Playwright clicks button | ast-grep finds pattern |
| API request checks response | Log says "success" |
| Screenshot comparison | "Code looks correct" |
**Every task MUST have a test that EXECUTES the code and VERIFIES behavior.**
### Rationalization Prevention (No Tests)
If you catch yourself thinking these thoughts, STOP:
| Thought | Reality |
|---------|---------|
| "No test infrastructure exists" | Add it as Task 0. That's the plan now. |
| "This is hard to test" | Use E2E tools (Playwright, ydotool). Ask user. |
| "I'll add tests later" | No. TDD means tests FIRST. |
| "Just this one task without tests" | No exceptions. Ever. |
| "Manual testing is in SPEC.md" | That's wrong. Fix it or ask user. |
| "User approved manual testing" | Push back. TDD is the workflow. |
### Rationalization Prevention (Fake Tests)
If you catch yourself thinking these thoughts, STOP - you're about to write a FAKE test:
| Thought | Reality | REAL Action |
|---------|---------|-------------|
| "Testing HTTP is easier" | But app uses WebSocket | Test WebSocket |
| "I can call the function directly" | That skips user workflow | Simulate user action |
| "I know a better way to test this" | User specified a skill | Use the specified skill |
| "Internal state check is more direct" | User doesn't see internal state | Verify user-visible output |
| "Mocking is simpler" | Mocks hide real bugs | Test actual integration |
| "The test fails, let me fix the assertion" | Maybe the test is wrong | Question if test is valid |
| "User workflow is too complex" | That's what user actually does | Replicate it anyway |
| "This shortcut tests the same thing" | No it doesn't | Follow exact workflow |
## Files to Modify
| File | Change |
|------|--------|
| `src/auth/service.ts` | Add `validateSession()` method |
| `src/routes/api.ts` | Add new endpoint |
## New Files
| File | Purpose |
|------|---------|
| `src/auth/types.ts` | Session type definitions |
## Implementation Order (with Per-Task Ralph Loops)
> **For Claude:** Each task = one ralph loop. Complete task N before starting task N+1.
>
> **TDD ENFORCEMENT:** Every task with code MUST have a failing test written BEFORE implementation.
>
> Pattern: `Skill(skill="ralph-loop:ralph-loop", args="Task N: [name] --max-iterations 10 --completion-promise TASKN_DONE")`
>
> **Task Dependencies:** Mark each task's `Deps` column: `—` = no dependencies (parallelizable), `after N` = must follow task N. Tasks with `—` or the same dependency can run in parallel when using agent team mode in dev-implement.
| Task | Deps | Ralph Loop | Failing Test (write FIRST) | Verify Command |
|------|------|------------|----------------------------|----------------|
| 0. Test infrastructure (if needed) | — | `"Task 0: Test setup" → TASK0_DONE` | N/A (meta-task) | `pytest --version` or `npm test -- --version` |
| 1. Add types | after 0 | `"Task 1: Add types" → TASK1_DONE` | N/A (types only) | `tsc --noEmit` |
| 2. Service method | after 1 | `"Task 2: Service method" → TASK2_DONE` | `test_validate_session()` - write test, see RED, then implement | `pytest tests/test_auth.py -v` |
| 3. Route handler | after 1 | `"Task 3: Route handler" → TASK3_DONE` | `test_api_endpoint()` - write test, see RED, then implement | `pytest tests/test_api.py -v` |
6. User Gate - Final Approval
After writing PLAN.md, get explicit approval:
AskUserQuestion(questions=[{
"question": "Ready to start implementation?",
"header": "Approval",
"options": [
{"label": "Yes, proceed", "description": "Start implementation with TDD"},
{"label": "No, discuss changes", "description": "Modify the plan first"}
],
"multiSelect": false
}])
If "No": Wait for user feedback, modify plan, ask again.
If "Yes": Proceed to workspace setup question in Step 7 below.
7. Workspace Setup Question
After user approves implementation, ask about worktree isolation:
AskUserQuestion(questions=[{
"question": "Create isolated worktree for this feature?",
"header": "Workspace",
"options": [
{"label": "Yes (Recommended)", "description": "Work in isolated .worktrees/ directory - keeps main workspace clean"},
{"label": "No", "description": "Work in current directory"}
],
"multiSelect": false
}])
If "Yes (Recommended)":
Invoke the dev-worktree skill:
# dev-worktree: Create isolated git worktree for feature development
Skill(skill="workflows:dev-worktree")
Then after worktree is created, invoke dev-implement.
If "No":
Directly invoke dev-implement in current directory without worktree isolation.
Approach Categories
| Category | When to Use | Trade-off | |----------|-------------|-----------| | Minimal | Bug fixes, small features | Speed vs cleanliness | | Clean | New systems, core features | Quality vs time | | Pragmatic | Most features | Balance |
PLAN.md Format
Required sections:
- Chosen Approach - What was selected and why
- Testing Strategy - Framework, command, first test, location, skill
- REAL Test Criteria - User workflow, protocol, UI elements, what user sees
- Files to Modify - Specific paths with change descriptions
- New Files - If any, with purposes
- Implementation Order - Ordered task list with dependencies
The Gate Function
Complete all steps before starting implementation:
1. REVIEW → Read SPEC.md and exploration findings
2. VERIFY TESTING → Check SPEC.md has automated testing strategy
└─ If missing → STOP. Go back to clarify phase.
3. PROPOSE → Present 2-3 approaches with trade-offs
4. ASK → Use AskUserQuestion with clear options
5. DECOMPOSE → Ask "One feature or multiple?" (CRITICAL)
└─ If multiple → List features, ask which first, write BACKLOG.md
6. WAIT → Do NOT proceed until user responds
7. DOCUMENT → Write PLAN.md with Testing Strategy section FILLED
8. VERIFY PLAN → Check PLAN.md Testing Strategy table has all boxes checked
└─ If any unchecked → STOP. Fill them before proceeding.
9. CONFIRM → Ask "Ready to proceed?"
10. WORKSPACE → Ask "Create worktree?" (Yes recommended / No)
11. SETUP → If worktree Yes, invoke dev-worktree
12. GATE → Only start /dev-implement after all approvals
Mandatory steps (NEVER skip): VERIFY TESTING, DECOMPOSE, VERIFY PLAN, WAIT, WORKSPACE, and GATE.
Testing Strategy Verification (Step 2 & 8)
Before proceeding past step 2, verify SPEC.md contains:
[ ] Testing approach (unit/integration/E2E)
[ ] Test framework (pytest/jest/playwright)
[ ] Test command (how to run)
[ ] Testing skill specified (dev-test-electron/playwright/etc.)
Before proceeding past step 8, verify PLAN.md Testing Strategy table:
[ ] Framework filled
[ ] Test Command filled
[ ] First Failing Test described
[ ] Test File Location specified
[ ] Testing Skill specified
If any box is unchecked → STOP. Do not proceed.
REAL Test Verification (Step 8 - CRITICAL)
Before proceeding past step 8, verify PLAN.md REAL Test Criteria table:
[ ] User workflow documented (exact steps user takes)
[ ] Protocol matches production (WebSocket/HTTP/IPC/etc.)
[ ] UI elements identified (what user interacts with)
[ ] User-visible output documented (what user sees)
[ ] Code path specified (same path as production)
If any box is unchecked → You WILL write fake tests. Fix now.
Fake Test Prevention Check (Step 8)
Ask yourself before proceeding:
- Will the test use the SAME protocol as production? (Not a substitute)
- Will the test follow the EXACT user workflow? (Not shortcuts)
- Will the test use the SPECIFIED testing skill? (Not your own approach)
- Will the test verify what the USER sees? (Not internal state)
If ANY answer is "no" → STOP. Fix the REAL Test Criteria section.
Rationalization Prevention
Recognize these thoughts as red flags—they signal attempts to bypass the user gate:
| Thought | Reality | |---------|---------| | "User will approve this" | Your assumption ≠ approval. Ask and wait. | | "It's the obvious choice" | User decides what's obvious. Present options. | | "Let me just start" | NO. Gate exists for a reason. Wait. | | "User said they trust me" | Trust doesn't mean skip approval. Ask. | | "Time pressure" | You'll waste more time with the wrong approach. Wait for approval. | | "Only one viable option" | Present it anyway. User may see alternatives. | | "Ask forgiveness later" | No. Ask permission now. |
Red Flags - STOP If You're About To:
| Action | Why It's Wrong | Do Instead | |--------|----------------|------------| | Present only one approach | You're removing user choice | Always show 2-3 options | | Skip trade-offs | You're making decision for user | Explain pros/cons clearly | | Start implementing | You don't have approval yet | Wait for explicit "Yes" | | Assume recommendation accepted | You're guessing at user preference | Ask and wait for answer |
Output
Design complete when:
- 2-3 approaches presented with trade-offs
- User chose an approach
.claude/PLAN.mdwritten with chosen approach- User explicitly approved ("Yes, proceed")
Phase Complete
After user approves ("Yes, proceed"):
- Ask about worktree (Step 7 above)
- If worktree chosen:
- Invoke
Skill(skill="workflows:dev-worktree") - After worktree created, invoke
Read("${CLAUDE_PLUGIN_ROOT}/lib/skills/dev-implement/SKILL.md")
- Invoke
- If no worktree:
- Directly invoke
Read("${CLAUDE_PLUGIN_ROOT}/lib/skills/dev-implement/SKILL.md")
- Directly invoke
Required before proceeding:
- Explicit user approval for implementation
- Feature scope decision (one feature vs multiple)
- User choice on worktree (Yes/No)
After this feature is implemented and PR'd:
If multiple features were identified in step 4, check .claude/BACKLOG.md for remaining features:
- View remaining features in BACKLOG.md
- Invoke
/devagain to tackle the next feature - Repeat until all features are complete
This enables incremental development: one feature → PR → merge → next feature.