Agent Skills: TDD Workflow

Strict TDD state machine with Result types. 7-state workflow (PLANNING, RED, GREEN, REFACTOR, VERIFY, BLOCKED, VIOLATION). Every message MUST announce state. Integrates with fn(args, deps), vitest-mock-extended, and typed error handling.

UncategorizedID: jagreehal/jagreehal-claude-skills/tdd-workflow

Install this agent skill to your local

pnpm dlx add-skill https://github.com/jagreehal/jagreehal-claude-skills/tree/HEAD/skills/tdd-workflow

Skill Files

Browse the full folder contents for tdd-workflow.

Download Skill

Loading file tree…

skills/tdd-workflow/SKILL.md

Skill Metadata

Name
tdd-workflow
Description
"Strict TDD state machine with Result types. 7-state workflow (PLANNING, RED, GREEN, REFACTOR, VERIFY, BLOCKED, VIOLATION). Every message MUST announce state. Integrates with fn(args, deps), vitest-mock-extended, and typed error handling."

TDD Workflow

Strict test-driven development with Result types, dependency injection, and state machine governance.

The Iron Law

NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST

Write code before the test? Delete it. Start over.

No exceptions:

  • Don't keep it as "reference"
  • Don't "adapt" it while writing tests
  • Don't look at it
  • Delete means delete

Implement fresh from tests. Period.

Rationale: Code written before tests is biased by implementation. TDD requires tests to drive design, not verify existing code.

CRITICAL: STATE MACHINE GOVERNANCE

EVERY SINGLE MESSAGE MUST START WITH YOUR CURRENT TDD STATE

Format:

βšͺ TDD: PLANNING
πŸ”΄ TDD: RED
🟒 TDD: GREEN
πŸ”΅ TDD: REFACTOR
🟑 TDD: VERIFY
⚠️ TDD: BLOCKED
πŸ”₯ TDD: VIOLATION

NOT JUST THE FIRST MESSAGE. EVERY. SINGLE. MESSAGE.

When you read a file β†’ prefix with TDD state When you run tests β†’ prefix with TDD state When you explain results β†’ prefix with TDD state When you ask a question β†’ prefix with TDD state

Example:

βšͺ TDD: PLANNING
Writing test for getUser returning NOT_FOUND...

βšͺ TDD: PLANNING
Running npm test to see it fail...

πŸ”΄ TDD: RED
Test fails correctly. Implementing minimum solution...

🟒 TDD: GREEN
Test passes. Checking if refactor needed...

State Machine Diagram

                  user request
                       ↓
                 β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
            β”Œβ”€β”€β”€β”€β”‚ PLANNING │────┐
            β”‚    β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜    β”‚
            β”‚          β”‚         β”‚
            β”‚  test fails        β”‚
            β”‚  correctly         β”‚
  unclear   β”‚          ↓         β”‚ blocker
            β”‚    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”‚
            └────│   RED    β”‚    β”‚
                 β”‚          β”‚    β”‚
                 β”‚ Test IS  β”‚    β”‚
                 β”‚ failing  β”‚    β”‚
                 β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜    β”‚
                      β”‚          β”‚
              test    β”‚          β”‚
              passes  β”‚          β”‚
                      ↓          β”‚
                 β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”‚
                 β”‚  GREEN   β”‚    β”‚
                 β”‚          β”‚    β”‚
                 β”‚ Test IS  β”‚    β”‚
                 β”‚ passing  β”‚    β”‚
                 β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜β”€β”€β”€β”€β”˜
                      β”‚
          refactoring β”‚
          needed      β”‚
                      ↓
                 β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
            β”Œβ”€β”€β”€β”€β”‚ REFACTOR β”‚
            β”‚    β”‚          β”‚
            β”‚    β”‚ Improve  β”‚
            β”‚    β”‚ design   β”‚
            β”‚    β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜
            β”‚         β”‚
            β”‚    done β”‚
            β”‚         β”‚
            β”‚         ↓
            β”‚    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
            β”‚    β”‚  VERIFY  β”‚
            β”‚    β”‚          β”‚
            β”‚    β”‚ Run full β”‚
  fail      β”‚    β”‚ suite +  β”‚
            β”‚    β”‚ lint +   β”‚
            └────│ build    β”‚
                 β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜
                      β”‚
                 pass β”‚
                      β”‚
                      ↓
                 [COMPLETE]

State: PLANNING

Prefix: βšͺ TDD: PLANNING

Purpose: Write a failing test that proves the requirement.

Pre-Conditions

  • User has provided a task/requirement/bug report
  • No other TDD cycle in progress

The Delete Rule

If production code exists before the test:

  1. Delete it immediately
  2. Do not keep as "reference"
  3. Do not "adapt" it
  4. Start fresh from the test

Rationale: Code written before tests is biased by implementation. TDD requires tests to drive design, not verify existing code.

Actions

  1. Analyze requirement - what behavior needs testing?
  2. Identify edge cases
  3. Write test using fn(args, deps) pattern
  4. Use vitest-mock-extended: const deps = mock<GetUserDeps>()
  5. Run test (use Bash tool)
  6. VERIFY test fails correctly (MANDATORY - see Verify RED below)
  7. Show exact failure message verbatim
  8. Justify why failure proves test is correct

Verify RED - Watch It Fail

MANDATORY. Never skip.

npm test path/to/test.test.ts

Confirm:

  • Test fails (not errors)
  • Failure message is expected
  • Fails because feature missing (not typos)

Test passes? You're testing existing behavior. Fix test. Test errors? Fix error, re-run until it fails correctly.

Post-Conditions (ALL required before transition)

  • [ ] Test written with proper deps mocking
  • [ ] Test executed
  • [ ] Test FAILED correctly (not setup error) - VERIFIED by watching it fail
  • [ ] Failure message shown verbatim
  • [ ] Failure is "meaningful" (assertion failure, not import error)

Validation Before Transition

Pre-transition validation:
βœ“ Test written: [yes/no]
βœ“ Test executed: [yes/no]
βœ“ Test failed correctly: [yes - output above]
βœ“ Meaningful failure: [yes - justification]

Transitioning to RED.

Transitions

  • PLANNING β†’ RED (test fails correctly)
  • PLANNING β†’ BLOCKED (cannot write valid test)

State: RED

Prefix: πŸ”΄ TDD: RED

Purpose: Test IS failing. Implement ONLY what the error message demands.

Pre-Conditions

  • Test written and executed
  • Test IS FAILING correctly
  • Failure is meaningful (not setup/syntax error)

The Hardcode Rule

Before implementing, announce:

Minimal implementation check:
- Error demands: [what the error literally says]
- Could hardcoded value work? [yes/no]
- If yes: [what hardcoded value]
- If no: [why real logic is required]

| Error Says | Do This | |-----------|---------| | expected err('NOT_FOUND') | return err('NOT_FOUND') | | expected ok(user) | return ok(mockUser) | | expected count: 0 | return { count: 0 } |

Only add logic when tests FORCE you to.

Actions

  1. Read error message - what does it literally ask for?
  2. Announce minimal implementation check
  3. Implement ONLY what error demands (hardcode if possible)
  4. Run test
  5. VERIFY test PASSES (MANDATORY - see Verify GREEN below)
  6. Run typecheck (npm run typecheck)
  7. Run lint (npm run lint)
  8. Show all output verbatim
  9. Transition to GREEN

Verify GREEN - Watch It Pass

MANDATORY. Never skip.

npm test path/to/test.test.ts

Confirm:

  • Test passes
  • Other tests still pass
  • Output pristine (no errors, warnings)

Test fails? Fix code, not test. Other tests fail? Fix now.

Post-Conditions (ALL required)

  • [ ] Implemented ONLY what error demanded
  • [ ] Test executed and PASSES
  • [ ] Code compiles (no TypeScript errors)
  • [ ] Code lints (no ESLint errors)
  • [ ] All output shown verbatim

Validation Before Transition

Post-condition validation:
βœ“ Test PASSES: [yes - output above]
βœ“ Code compiles: [yes - output above]
βœ“ Code lints: [yes - output above]
βœ“ Implementation minimal: [justification]

Transitioning to GREEN.

Critical Rules

  • NEVER transition to GREEN without test PASS + compile + lint
  • NEVER anticipate future errors - address THIS error only
  • NEVER change test to match implementation - fix the code

Transitions

  • RED β†’ GREEN (test passes, code compiles, code lints)
  • RED β†’ BLOCKED (cannot make test pass)
  • RED β†’ PLANNING (test failure reveals misunderstood requirement)

State: GREEN

Prefix: 🟒 TDD: GREEN

Purpose: Test IS passing. Assess code quality and decide next step.

Pre-Conditions

  • Test exists and PASSES
  • Code compiles and lints
  • Implementation is minimal

Actions

  1. Review implementation against patterns:

| Pattern | Check | |---------|-------| | fn(args, deps) | Is deps type explicit and minimal? | | Result types | Are all error cases typed? | | Validation | Is Zod at boundary only? | | Naming | Domain-specific, not generic? |

  1. Decide: Does code need refactoring?
  2. If YES β†’ REFACTOR
  3. If NO β†’ VERIFY

Post-Conditions

  • [ ] Test IS PASSING
  • [ ] Code quality assessed
  • [ ] Decision made: refactor or verify

Transitions

  • GREEN β†’ REFACTOR (improvements needed)
  • GREEN β†’ VERIFY (code is clean)
  • GREEN β†’ RED (test starts failing - regression)

State: REFACTOR

Prefix: πŸ”΅ TDD: REFACTOR

Purpose: Tests ARE passing. Improve design while keeping green.

Pre-Conditions

  • Tests ARE PASSING
  • Refactoring needs identified

Refactor Checklist

// 1. Extract deps type
// BEFORE
async function getUser(
  args: { userId: string },
  deps: { db: Database; logger: Logger }
)

// AFTER
type GetUserDeps = { db: Database; logger: Logger };
async function getUser(args: { userId: string }, deps: GetUserDeps)

// 2. Type all errors
// BEFORE
return err('error')

// AFTER
type GetUserError = 'NOT_FOUND' | 'DB_ERROR';
return err<GetUserError>('NOT_FOUND')

// 3. Use domain names
// BEFORE
const data = await deps.db.get(id);

// AFTER
const user = await deps.db.findUser(userId);

Actions

  1. Apply ONE refactoring
  2. Run tests - verify still pass
  3. Repeat until no more improvements
  4. Transition to VERIFY

Post-Conditions

  • [ ] Deps type is explicit and exported
  • [ ] All error types are explicit in signature
  • [ ] Names are domain-specific
  • [ ] Tests still pass after each refactor

When to Skip REFACTOR

  • Code is already clean
  • Single hardcoded value (generalize when second test forces it)

Never skip when:

  • Deps type missing or inline
  • Error types not explicit
  • Generic names (data, util, result)

Transitions

  • REFACTOR β†’ VERIFY (code quality satisfactory)
  • REFACTOR β†’ RED (refactor broke test)
  • REFACTOR β†’ BLOCKED (cannot refactor due to constraints)

State: VERIFY

Prefix: 🟑 TDD: VERIFY

Purpose: Run full test suite + lint + build before claiming complete.

Actions

  1. Run full test suite: npm test
  2. Run lint: npm run lint
  3. Run typecheck: npm run typecheck
  4. Run build: npm run build (if applicable)
  5. Show ALL output verbatim
  6. If ALL pass β†’ COMPLETE
  7. If ANY fail β†’ route appropriately

Post-Conditions (ALL required)

  • [ ] Full test suite executed - ALL pass
  • [ ] Lint executed - passes
  • [ ] Typecheck executed - passes
  • [ ] Build executed - succeeds
  • [ ] All output shown

Validation Before Completion

Final validation:
βœ“ Full test suite: [X/X tests passed - output shown]
βœ“ Lint: [passed - output shown]
βœ“ Typecheck: [passed - output shown]
βœ“ Build: [succeeded - output shown]

TDD cycle COMPLETE.

Summary:
- Tests written: [count]
- Refactorings: [count]

Transitions

  • VERIFY β†’ COMPLETE (all checks pass)
  • VERIFY β†’ RED (tests fail - regression)
  • VERIFY β†’ REFACTOR (lint fails - code quality)
  • VERIFY β†’ BLOCKED (build fails - structural issue)

State: BLOCKED

Prefix: ⚠️ TDD: BLOCKED

Purpose: Handle situations where progress cannot continue.

Actions

  1. Clearly explain blocking issue
  2. Explain which state you were in
  3. Explain what you were trying to do
  4. Suggest possible resolutions
  5. STOP and wait for user guidance

Critical Rules

  • NEVER improvise workarounds
  • NEVER skip steps to "unblock" yourself
  • ALWAYS stop and wait for user

Transitions

  • BLOCKED β†’ [any state] (based on user guidance)

State: VIOLATION

Prefix: πŸ”₯ TDD: VIOLATION

Purpose: Handle state machine violations.

Triggers

  • Forgot state announcement
  • Skipped state
  • Failed to validate post-conditions
  • Claimed complete without evidence
  • Changed test assertion to match implementation
  • Implemented full solution when hardcode would work

Actions

  1. IMMEDIATELY announce: πŸ”₯ TDD: VIOLATION
  2. Explain which rule was violated
  3. Explain what you did wrong
  4. Announce correct current state
  5. Ask user permission to recover

Example

πŸ”₯ TDD: VIOLATION

Violation: Implemented full user lookup logic when hardcoded
value would satisfy the single test case.

Correct action: Return hardcoded { id: '123', name: 'Alice' }

Recovering to RED to fix implementation...

Test Structure for fn(args, deps)

import { describe, it, expect } from 'vitest';
import { mock } from 'vitest-mock-extended';
import { getUser, type GetUserDeps } from './get-user';

describe('getUser', () => {
  it('returns err NOT_FOUND when user does not exist', async () => {
    // Arrange: mock deps
    const deps = mock<GetUserDeps>();
    deps.db.findUser.mockResolvedValue(null);

    // Act
    const result = await getUser({ userId: '123' }, deps);

    // Assert Result type
    expect(result.ok).toBe(false);
    if (!result.ok) {
      expect(result.error).toBe('NOT_FOUND');
    }
  });

  it('returns ok with user when found', async () => {
    const mockUser = { id: '123', name: 'Alice' };
    const deps = mock<GetUserDeps>();
    deps.db.findUser.mockResolvedValue(mockUser);

    const result = await getUser({ userId: '123' }, deps);

    expect(result.ok).toBe(true);
    if (result.ok) {
      expect(result.value).toEqual(mockUser);
    }
  });
});

Meaningful vs Setup Failures

| Type | Example | Action | |------|---------|--------| | Meaningful | "expected err('NOT_FOUND') but got ok(user)" | Proceed to RED | | Meaningful | "expected false, received true" | Proceed to RED | | Setup | "Cannot find module './get-user'" | Fix import, stay in PLANNING | | Setup | "Property 'findUser' does not exist" | Fix mock, stay in PLANNING | | Setup | "TypeError: deps.db is undefined" | Fix setup, stay in PLANNING |

Red Flags - STOP and Start Over

If you encounter any of these, delete code and start over with TDD:

  • Code before test
  • Test after implementation
  • Test passes immediately (didn't watch it fail)
  • Can't explain why test failed
  • Tests added "later"
  • Rationalizing "just this once"
  • "I already manually tested it"
  • "Keep as reference" or "adapt existing code"
  • "Already spent X hours, deleting is wasteful"
  • "TDD is dogmatic, I'm being pragmatic"
  • "This is different because..."

All of these mean: Delete code. Start over with TDD.

Common Rationalizations

| Excuse | Reality | |--------|--------| | "Too simple to test" | Simple code breaks. Test takes 30 seconds. | | "I'll test after" | Tests passing immediately prove nothing. | | "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" | | "Already manually tested" | Ad-hoc β‰  systematic. No record, can't re-run. | | "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. | | "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. | | "Need to explore first" | Fine. Throw away exploration, start with TDD. | | "Test hard = design unclear" | Listen to test. Hard to test = hard to use. | | "TDD will slow me down" | TDD faster than debugging. Pragmatic = test-first. | | "Manual test faster" | Manual doesn't prove edge cases. You'll re-test every change. |

Thinking "skip TDD just this once"? Stop. That's rationalization.

Critical Rules Summary

| Rule | Enforcement | |------|-------------| | Iron Law | NO production code without failing test first | | Delete code rule | If code exists before test, delete it | | State announcement | EVERY message starts with state | | Verify RED | MANDATORY - watch test fail | | Verify GREEN | MANDATORY - watch test pass | | No green without proof | Must see test pass output | | Error message driven | Implement ONLY what error demands | | Hardcode first | One test? Return expected value | | Never change test | Fix implementation, not assertion | | Validate before transition | Check all post-conditions |

Never Use vi.mock() for App Logic

// WRONG: vi.mock creates brittle path coupling
vi.mock('../infra/database', () => ({ db: mockDb }));

// CORRECT: Inject deps, mock with vitest-mock-extended
const deps = mock<GetUserDeps>();
deps.db.findUser.mockResolvedValue(mockUser);

Quick Reference

| State | Prefix | Action | Exit Condition | |-------|--------|--------|----------------| | PLANNING | βšͺ | Write failing test | Test fails meaningfully | | RED | πŸ”΄ | Minimum implementation | Test passes + compile + lint | | GREEN | 🟒 | Assess quality | Decision: refactor or verify | | REFACTOR | πŸ”΅ | Improve design | Tests still pass, code clean | | VERIFY | 🟑 | Full suite + lint + build | All green | | BLOCKED | ⚠️ | Explain blocker, wait | User guidance | | VIOLATION | πŸ”₯ | Acknowledge, recover | Permission to continue |