Agent Skills: Systematic Debugging Protocol

Systematic 4-phase debugging methodology for complex, intermittent, or mysterious issues. Use when investigating bugs, race conditions, or unexplained failures.

UncategorizedID: travisjneuman/.claude/debug-systematic

Install this agent skill to your local

pnpm dlx add-skill https://github.com/travisjneuman/.claude/tree/HEAD/skills/debug-systematic

Skill Files

Browse the full folder contents for debug-systematic.

Download Skill

Loading file tree…

skills/debug-systematic/SKILL.md

Skill Metadata

Name
debug-systematic
Description
Systematic 4-phase debugging methodology for complex, intermittent, or mysterious issues. Use when investigating bugs, race conditions, or unexplained failures.

Systematic Debugging Protocol

A disciplined, evidence-based approach to debugging that prevents guessing and ensures root cause discovery.

The 4-Phase Protocol

Phase 1: REPRODUCE (Establish Ground Truth)

Goal: Create reliable reproduction steps before ANY investigation.

Actions:

  1. Document exact steps to trigger the bug
  2. Record environment specifics (OS, versions, config, memory, network)
  3. Determine frequency: Always? Sometimes? Specific conditions?
  4. Capture exact error messages, stack traces, screenshots
  5. Test on different environments to isolate variables

Key Questions:

  • When did it last work correctly?
  • What changed since then? (code, deps, config, infrastructure)
  • Is it environment-specific?
  • Is it data-specific?
  • Is it timing-specific?

Output: Clear reproduction steps that reliably trigger the issue.


Phase 2: ISOLATE (Narrow the Scope)

Goal: Reduce the search space from "entire codebase" to "specific component."

Techniques:

Binary Search:

  1. Identify two points: working state and broken state
  2. Test the midpoint
  3. Recurse into the broken half
  4. Continue until the change is identified

Git Bisect (for regressions):

git bisect start
git bisect bad HEAD
git bisect good <known-good-commit>
# Git will checkout commits for testing
# After each test:
git bisect good  # or git bisect bad
# Continue until culprit found

Code Elimination:

  • Comment out sections to isolate the problem
  • Create minimal reproduction case
  • Strip away everything non-essential

Environment Isolation:

  • Test in isolation (unit test the failing path)
  • Compare working vs broken environments
  • Use fresh installs to eliminate pollution

Output: "The bug is in [specific component/function/line range]"


Phase 3: DIAGNOSE (Understand Root Cause)

Goal: Know exactly WHY the bug occurs, not just WHERE.

Scientific Method:

  1. Observe: What exactly is happening?
  2. Hypothesize: Why might this be happening?
  3. Predict: If hypothesis is correct, what else would be true?
  4. Test: Verify predictions with evidence
  5. Iterate: Refine hypothesis based on results

Logging Strategy:

// Add strategic logging at boundaries
console.log("[DEBUG] Function entry:", { input, state });
console.log("[DEBUG] After processing:", { result, sideEffects });
console.log("[DEBUG] Function exit:", { returnValue });

Common Root Causes:

| Symptom | Likely Causes | | -------------------------- | -------------------------------------------------- | | Works locally, fails in CI | Environment differences, timing, resources | | Intermittent failure | Race condition, flaky network, resource contention | | Works then stops working | State mutation, memory leak, cache poisoning | | Wrong data | Type coercion, encoding, timezone, precision | | Silent failure | Swallowed exception, async error, missing await |

Output: Clear explanation of the root cause with evidence.


Phase 4: FIX & VERIFY (Resolve and Prevent)

Goal: Fix the issue and prevent regression.

Fix Process:

  1. Write a failing test that captures the bug
  2. Implement minimal fix - change as little as possible
  3. Verify test passes - confirms fix works
  4. Check for similar patterns - same bug elsewhere?
  5. Review fix for side effects - does it break anything?
  6. Document the fix - why it happened, how to prevent

Verification Checklist:

  • [ ] Test passes that specifically catches this bug
  • [ ] Existing tests still pass
  • [ ] Manual verification confirms fix
  • [ ] Fix works in all affected environments
  • [ ] No new warnings or errors introduced

Prevention:

  • Add guards/validation at boundaries
  • Improve error messages for easier future debugging
  • Document gotchas for other developers
  • Consider if architectural change prevents similar bugs

Debugging Anti-Patterns

DO NOT:

  • Guess and hope (change things randomly)
  • Assume you know the problem without evidence
  • Trust comments/docs over actual code behavior
  • Debug production with print statements you'll forget to remove
  • Fix the symptom instead of the root cause
  • Make multiple changes at once

DO:

  • Verify assumptions with evidence
  • Change one thing at a time
  • Log actual values, not what you expect
  • Trust the code over documentation
  • Take breaks when stuck (fresh eyes help)

Quick Reference

1. REPRODUCE → Can I reliably trigger this?
2. ISOLATE   → Where exactly is it failing?
3. DIAGNOSE  → Why is it failing?
4. FIX       → How do I fix it permanently?

Output Template

## Bug Investigation: [Title]

### Reproduction

- Steps to reproduce
- Environment details
- Frequency

### Isolation

- Search method used
- Scope narrowed to

### Root Cause

- What's actually wrong
- Why it happens
- Evidence

### Fix

- Code changes made
- Test added

### Prevention

- How to prevent similar bugs
- Documentation updates