TDD Workflow
Strict test-driven development with Result types, dependency injection, and state machine governance.
The Iron Law
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
Write code before the test? Delete it. Start over.
No exceptions:
- Don't keep it as "reference"
- Don't "adapt" it while writing tests
- Don't look at it
- Delete means delete
Implement fresh from tests. Period.
Rationale: Code written before tests is biased by implementation. TDD requires tests to drive design, not verify existing code.
CRITICAL: STATE MACHINE GOVERNANCE
EVERY SINGLE MESSAGE MUST START WITH YOUR CURRENT TDD STATE
Format:
⚪ TDD: PLANNING
🔴 TDD: RED
🟢 TDD: GREEN
🔵 TDD: REFACTOR
🟡 TDD: VERIFY
⚠️ TDD: BLOCKED
🔥 TDD: VIOLATION
NOT JUST THE FIRST MESSAGE. EVERY. SINGLE. MESSAGE.
When you read a file → prefix with TDD state.
When you run tests → prefix with TDD state.
When you explain results → prefix with TDD state.
When you ask a question → prefix with TDD state.
Example:
⚪ TDD: PLANNING
Writing test for getUser returning NOT_FOUND...
⚪ TDD: PLANNING
Running npm test to see it fail...
🔴 TDD: RED
Test fails correctly. Implementing minimum solution...
🟢 TDD: GREEN
Test passes. Checking if refactor needed...
State Machine Diagram
```
user request
     |
     v
PLANNING --test fails correctly--> RED --test passes--> GREEN
                                                          |
                                        refactoring needed (else go to VERIFY)
                                                          v
                                                      REFACTOR
                                                          |
                                                        done
                                                          v
                                                       VERIFY
                                                          |
                                                        pass
                                                          v
                                                      [COMPLETE]

Exception edges:
- RED --> PLANNING          (failure reveals misunderstood requirement)
- REFACTOR --> RED          (refactor broke a test)
- VERIFY --> RED/REFACTOR   (a check fails; route appropriately)
- any state --> BLOCKED     (blocker); BLOCKED --> any state (user guidance)
```
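The legal transitions can also be sketched as a typed transition table. This is an illustrative sketch only: the state names come from this workflow, but the `canTransition` helper is hypothetical, not part of any tooling.

```typescript
// Illustrative sketch: TDD states and legal transitions as data.
// State names come from this workflow; `canTransition` is a hypothetical helper.
type TddState =
  | 'PLANNING' | 'RED' | 'GREEN' | 'REFACTOR'
  | 'VERIFY' | 'BLOCKED' | 'VIOLATION' | 'COMPLETE';

const transitions: Record<TddState, TddState[]> = {
  PLANNING: ['RED', 'BLOCKED'],
  RED: ['GREEN', 'PLANNING', 'BLOCKED'],
  GREEN: ['REFACTOR', 'VERIFY', 'RED'],
  REFACTOR: ['VERIFY', 'RED', 'BLOCKED'],
  VERIFY: ['COMPLETE', 'RED', 'REFACTOR', 'BLOCKED'],
  // BLOCKED and VIOLATION recover to any working state, per user guidance.
  BLOCKED: ['PLANNING', 'RED', 'GREEN', 'REFACTOR', 'VERIFY'],
  VIOLATION: ['PLANNING', 'RED', 'GREEN', 'REFACTOR', 'VERIFY'],
  COMPLETE: [],
};

function canTransition(from: TddState, to: TddState): boolean {
  return transitions[from].includes(to);
}
```

Encoding the table as data makes the rules checkable: skipping a state (e.g. PLANNING straight to GREEN) is simply not in the table.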
State: PLANNING
Prefix: ⚪ TDD: PLANNING
Purpose: Write a failing test that proves the requirement.
Pre-Conditions
- User has provided a task/requirement/bug report
- No other TDD cycle in progress
The Delete Rule
If production code exists before the test:
- Delete it immediately
- Do not keep as "reference"
- Do not "adapt" it
- Start fresh from the test
Rationale: Code written before tests is biased by implementation. TDD requires tests to drive design, not verify existing code.
Actions
- Analyze requirement - what behavior needs testing?
- Identify edge cases
- Write test using fn(args, deps) pattern
- Use vitest-mock-extended: `const deps = mock<GetUserDeps>()`
- Run test (use Bash tool)
- VERIFY test fails correctly (MANDATORY - see Verify RED below)
- Show exact failure message verbatim
- Justify why failure proves test is correct
Verify RED - Watch It Fail
MANDATORY. Never skip.
npm test path/to/test.test.ts
Confirm:
- Test fails (not errors)
- Failure message is expected
- Fails because feature missing (not typos)
Test passes? You're testing existing behavior. Fix test. Test errors? Fix error, re-run until it fails correctly.
Post-Conditions (ALL required before transition)
- [ ] Test written with proper deps mocking
- [ ] Test executed
- [ ] Test FAILED correctly (not setup error) - VERIFIED by watching it fail
- [ ] Failure message shown verbatim
- [ ] Failure is "meaningful" (assertion failure, not import error)
Validation Before Transition
Pre-transition validation:
✅ Test written: [yes/no]
✅ Test executed: [yes/no]
✅ Test failed correctly: [yes - output above]
✅ Meaningful failure: [yes - justification]
Transitioning to RED.
Transitions
- PLANNING → RED (test fails correctly)
- PLANNING → BLOCKED (cannot write valid test)
State: RED
Prefix: 🔴 TDD: RED
Purpose: Test IS failing. Implement ONLY what the error message demands.
Pre-Conditions
- Test written and executed
- Test IS FAILING correctly
- Failure is meaningful (not setup/syntax error)
The Hardcode Rule
Before implementing, announce:
Minimal implementation check:
- Error demands: [what the error literally says]
- Could hardcoded value work? [yes/no]
- If yes: [what hardcoded value]
- If no: [why real logic is required]
| Error Says | Do This |
|-----------|---------|
| `expected err('NOT_FOUND')` | `return err('NOT_FOUND')` |
| `expected ok(user)` | `return ok(mockUser)` |
| `expected count: 0` | `return { count: 0 }` |
Only add logic when tests FORCE you to.
Actions
- Read error message - what does it literally ask for?
- Announce minimal implementation check
- Implement ONLY what error demands (hardcode if possible)
- Run test
- VERIFY test PASSES (MANDATORY - see Verify GREEN below)
- Run typecheck (`npm run typecheck`)
- Run lint (`npm run lint`)
- Show all output verbatim
- Transition to GREEN
Verify GREEN - Watch It Pass
MANDATORY. Never skip.
npm test path/to/test.test.ts
Confirm:
- Test passes
- Other tests still pass
- Output pristine (no errors, warnings)
Test fails? Fix code, not test. Other tests fail? Fix now.
Post-Conditions (ALL required)
- [ ] Implemented ONLY what error demanded
- [ ] Test executed and PASSES
- [ ] Code compiles (no TypeScript errors)
- [ ] Code lints (no ESLint errors)
- [ ] All output shown verbatim
Validation Before Transition
Post-condition validation:
✅ Test PASSES: [yes - output above]
✅ Code compiles: [yes - output above]
✅ Code lints: [yes - output above]
✅ Implementation minimal: [justification]
Transitioning to GREEN.
Critical Rules
- NEVER transition to GREEN without test PASS + compile + lint
- NEVER anticipate future errors - address THIS error only
- NEVER change test to match implementation - fix the code
Transitions
- RED → GREEN (test passes, code compiles, code lints)
- RED → BLOCKED (cannot make test pass)
- RED → PLANNING (test failure reveals misunderstood requirement)
State: GREEN
Prefix: 🟢 TDD: GREEN
Purpose: Test IS passing. Assess code quality and decide next step.
Pre-Conditions
- Test exists and PASSES
- Code compiles and lints
- Implementation is minimal
Actions
- Review implementation against patterns:
| Pattern | Check |
|---------|-------|
| fn(args, deps) | Is deps type explicit and minimal? |
| Result types | Are all error cases typed? |
| Validation | Is Zod at boundary only? |
| Naming | Domain-specific, not generic? |
- Decide: Does code need refactoring?
- If YES → REFACTOR
- If NO → VERIFY
Post-Conditions
- [ ] Test IS PASSING
- [ ] Code quality assessed
- [ ] Decision made: refactor or verify
Transitions
- GREEN → REFACTOR (improvements needed)
- GREEN → VERIFY (code is clean)
- GREEN → RED (test starts failing - regression)
State: REFACTOR
Prefix: 🔵 TDD: REFACTOR
Purpose: Tests ARE passing. Improve design while keeping green.
Pre-Conditions
- Tests ARE PASSING
- Refactoring needs identified
Refactor Checklist
```typescript
// 1. Extract deps type

// BEFORE
async function getUser(
  args: { userId: string },
  deps: { db: Database; logger: Logger }
)

// AFTER
type GetUserDeps = { db: Database; logger: Logger };
async function getUser(args: { userId: string }, deps: GetUserDeps)

// 2. Type all errors

// BEFORE
return err('error')

// AFTER
type GetUserError = 'NOT_FOUND' | 'DB_ERROR';
return err<GetUserError>('NOT_FOUND')

// 3. Use domain names

// BEFORE
const data = await deps.db.get(id);

// AFTER
const user = await deps.db.findUser(userId);
```
Actions
- Apply ONE refactoring
- Run tests - verify still pass
- Repeat until no more improvements
- Transition to VERIFY
Post-Conditions
- [ ] Deps type is explicit and exported
- [ ] All error types are explicit in signature
- [ ] Names are domain-specific
- [ ] Tests still pass after each refactor
When to Skip REFACTOR
- Code is already clean
- Single hardcoded value (generalize when second test forces it)
Never skip when:
- Deps type missing or inline
- Error types not explicit
- Generic names (data, util, result)
Transitions
- REFACTOR → VERIFY (code quality satisfactory)
- REFACTOR → RED (refactor broke test)
- REFACTOR → BLOCKED (cannot refactor due to constraints)
State: VERIFY
Prefix: 🟡 TDD: VERIFY
Purpose: Run full test suite + lint + build before claiming complete.
Actions
- Run full test suite: `npm test`
- Run lint: `npm run lint`
- Run typecheck: `npm run typecheck`
- Run build: `npm run build` (if applicable)
- Show ALL output verbatim
- If ALL pass → COMPLETE
- If ANY fail → route appropriately
Post-Conditions (ALL required)
- [ ] Full test suite executed - ALL pass
- [ ] Lint executed - passes
- [ ] Typecheck executed - passes
- [ ] Build executed - succeeds
- [ ] All output shown
Validation Before Completion
Final validation:
✅ Full test suite: [X/X tests passed - output shown]
✅ Lint: [passed - output shown]
✅ Typecheck: [passed - output shown]
✅ Build: [succeeded - output shown]
TDD cycle COMPLETE.
Summary:
- Tests written: [count]
- Refactorings: [count]
Transitions
- VERIFY → COMPLETE (all checks pass)
- VERIFY → RED (tests fail - regression)
- VERIFY → REFACTOR (lint fails - code quality)
- VERIFY → BLOCKED (build fails - structural issue)
State: BLOCKED
Prefix: ⚠️ TDD: BLOCKED
Purpose: Handle situations where progress cannot continue.
Actions
- Clearly explain blocking issue
- Explain which state you were in
- Explain what you were trying to do
- Suggest possible resolutions
- STOP and wait for user guidance
Critical Rules
- NEVER improvise workarounds
- NEVER skip steps to "unblock" yourself
- ALWAYS stop and wait for user
Transitions
- BLOCKED → [any state] (based on user guidance)
State: VIOLATION
Prefix: 🔥 TDD: VIOLATION
Purpose: Handle state machine violations.
Triggers
- Forgot state announcement
- Skipped state
- Failed to validate post-conditions
- Claimed complete without evidence
- Changed test assertion to match implementation
- Implemented full solution when hardcode would work
Actions
- IMMEDIATELY announce: `🔥 TDD: VIOLATION`
- Explain which rule was violated
- Explain what you did wrong
- Announce correct current state
- Ask user permission to recover
Example
🔥 TDD: VIOLATION
Violation: Implemented full user lookup logic when hardcoded
value would satisfy the single test case.
Correct action: Return hardcoded { id: '123', name: 'Alice' }
Recovering to RED to fix implementation...
Test Structure for fn(args, deps)
```typescript
import { describe, it, expect } from 'vitest';
import { mock } from 'vitest-mock-extended';
import { getUser, type GetUserDeps } from './get-user';

describe('getUser', () => {
  it('returns err NOT_FOUND when user does not exist', async () => {
    // Arrange: mock deps
    const deps = mock<GetUserDeps>();
    deps.db.findUser.mockResolvedValue(null);

    // Act
    const result = await getUser({ userId: '123' }, deps);

    // Assert Result type
    expect(result.ok).toBe(false);
    if (!result.ok) {
      expect(result.error).toBe('NOT_FOUND');
    }
  });

  it('returns ok with user when found', async () => {
    const mockUser = { id: '123', name: 'Alice' };
    const deps = mock<GetUserDeps>();
    deps.db.findUser.mockResolvedValue(mockUser);

    const result = await getUser({ userId: '123' }, deps);

    expect(result.ok).toBe(true);
    if (result.ok) {
      expect(result.value).toEqual(mockUser);
    }
  });
});
```
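For context, here is a sketch of the production module such a test would exercise. The Result shape and the exact `GetUserDeps` fields are assumptions for illustration, not the project's actual definitions:

```typescript
// get-user.ts -- illustrative sketch; Result shape and deps fields are assumptions.
export type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

export type User = { id: string; name: string };

// Explicit, minimal deps type: only what getUser actually needs.
export type GetUserDeps = {
  db: { findUser: (userId: string) => Promise<User | null> };
};

export type GetUserError = 'NOT_FOUND';

export async function getUser(
  args: { userId: string },
  deps: GetUserDeps
): Promise<Result<User, GetUserError>> {
  const user = await deps.db.findUser(args.userId);
  return user === null ? err('NOT_FOUND') : ok(user);
}
```

Because `deps` is a plain object type, tests can satisfy it with `mock<GetUserDeps>()` or even a hand-written literal; no module-path mocking is needed.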
Meaningful vs Setup Failures
| Type | Example | Action |
|------|---------|--------|
| Meaningful | "expected err('NOT_FOUND') but got ok(user)" | Proceed to RED |
| Meaningful | "expected false, received true" | Proceed to RED |
| Setup | "Cannot find module './get-user'" | Fix import, stay in PLANNING |
| Setup | "Property 'findUser' does not exist" | Fix mock, stay in PLANNING |
| Setup | "TypeError: deps.db is undefined" | Fix setup, stay in PLANNING |
Red Flags - STOP and Start Over
If you encounter any of these, delete code and start over with TDD:
- Code before test
- Test after implementation
- Test passes immediately (didn't watch it fail)
- Can't explain why test failed
- Tests added "later"
- Rationalizing "just this once"
- "I already manually tested it"
- "Keep as reference" or "adapt existing code"
- "Already spent X hours, deleting is wasteful"
- "TDD is dogmatic, I'm being pragmatic"
- "This is different because..."
All of these mean: Delete code. Start over with TDD.
Common Rationalizations
| Excuse | Reality |
|--------|---------|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
| "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. |
| "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. |
| "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. |
| "Need to explore first" | Fine. Throw away exploration, start with TDD. |
| "Test hard = design unclear" | Listen to the test. Hard to test = hard to use. |
| "TDD will slow me down" | TDD is faster than debugging. Pragmatic = test-first. |
| "Manual test faster" | Manual doesn't prove edge cases. You'll re-test every change. |
Thinking "skip TDD just this once"? Stop. That's rationalization.
Critical Rules Summary
| Rule | Enforcement |
|------|-------------|
| Iron Law | NO production code without failing test first |
| Delete code rule | If code exists before test, delete it |
| State announcement | EVERY message starts with state |
| Verify RED | MANDATORY - watch test fail |
| Verify GREEN | MANDATORY - watch test pass |
| No green without proof | Must see test pass output |
| Error message driven | Implement ONLY what error demands |
| Hardcode first | One test? Return expected value |
| Never change test | Fix implementation, not assertion |
| Validate before transition | Check all post-conditions |
Never Use vi.mock() for App Logic
```typescript
// WRONG: vi.mock creates brittle path coupling
vi.mock('../infra/database', () => ({ db: mockDb }));

// CORRECT: Inject deps, mock with vitest-mock-extended
const deps = mock<GetUserDeps>();
deps.db.findUser.mockResolvedValue(mockUser);
```
Quick Reference
| State | Prefix | Action | Exit Condition |
|-------|--------|--------|----------------|
| PLANNING | ⚪ | Write failing test | Test fails meaningfully |
| RED | 🔴 | Minimum implementation | Test passes + compile + lint |
| GREEN | 🟢 | Assess quality | Decision: refactor or verify |
| REFACTOR | 🔵 | Improve design | Tests still pass, code clean |
| VERIFY | 🟡 | Full suite + lint + build | All green |
| BLOCKED | ⚠️ | Explain blocker, wait | User guidance |
| VIOLATION | 🔥 | Acknowledge, recover | Permission to continue |