Systematic Debugging Protocol Skill

A disciplined, evidence-based approach to debugging that prevents guessing and ensures root cause discovery.

The 4-Phase Protocol

Phase 1: REPRODUCE (Establish Ground Truth)

Goal: Create reliable reproduction steps before ANY investigation.

Actions:

Document exact steps to trigger the bug
Record environment specifics (OS, versions, config, memory, network)
Determine frequency: Always? Sometimes? Specific conditions?
Capture exact error messages, stack traces, screenshots
Test on different environments to isolate variables
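
Recording environment specifics is easier to do consistently when it is scripted. The following is a minimal sketch, assuming a Node.js runtime; the helper name `captureEnvironment` and the fields it collects are illustrative choices, not part of the protocol.

```javascript
// Hypothetical helper: collects environment details to attach to a bug report.
// Assumes Node.js; uses only the built-in `process` global.
function captureEnvironment() {
  return {
    platform: process.platform,            // e.g. "linux", "darwin", "win32"
    arch: process.arch,                    // e.g. "x64", "arm64"
    nodeVersion: process.version,          // runtime version string
    rssBytes: process.memoryUsage().rss,   // resident memory at capture time
    nodeEnv: process.env.NODE_ENV ?? null, // config that often differs between environments
    capturedAt: new Date().toISOString(),
  };
}

console.log(JSON.stringify(captureEnvironment(), null, 2));
```

Running the same script in a working and a broken environment gives two snapshots that can be compared field by field.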

Key Questions:

When did it last work correctly?
What changed since then? (code, deps, config, infrastructure)
Is it environment-specific?
Is it data-specific?
Is it timing-specific?

Output: Clear reproduction steps that reliably trigger the issue.
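
For bugs that occur only sometimes, it can help to put a number on "sometimes". The sketch below assumes the reproduction can be wrapped in a synchronous function that throws when the bug appears; `measureFailureRate` and `reproduce` are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: runs a reproduction function `attempts` times and
// counts how often it throws. `reproduce` is an assumption: any synchronous
// function that throws when the bug manifests.
function measureFailureRate(reproduce, attempts = 100) {
  let failures = 0;
  let firstError = null;
  for (let i = 0; i < attempts; i++) {
    try {
      reproduce(i);
    } catch (err) {
      failures++;
      if (firstError === null) firstError = err; // keep the first error as evidence
    }
  }
  return { attempts, failures, rate: failures / attempts, firstError };
}

// Illustrative stand-in for a bug that only appears for some inputs.
const report = measureFailureRate((i) => {
  if (i % 4 === 0) throw new Error(`failed on attempt ${i}`);
}, 20);
console.log(report.failures, "of", report.attempts, "runs failed");
```

A measured failure rate also gives a baseline for Phase 4: a fix for an intermittent bug is only convincing if the rate drops to zero over a comparable number of runs.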

Phase 2: ISOLATE (Narrow the Scope)

Goal: Reduce the search space from "entire codebase" to "specific component."

Techniques:

Binary Search:

Identify two points: working state and broken state
Test the midpoint
Recurse into the broken half
Continue until the change is identified
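
The steps above can be sketched as a small function. It assumes the candidates (builds, commits, config versions) are ordered, that everything before the breaking change works, and that the last candidate is known to be broken; `findFirstBroken` and `isBroken` are hypothetical names.

```javascript
// Finds the first "broken" entry in an ordered list, assuming everything
// before it works and everything from it onward is broken.
// `isBroken` is a hypothetical predicate supplied by the caller.
function findFirstBroken(items, isBroken) {
  let lo = 0;                // lowest index that could be the first broken one
  let hi = items.length - 1; // index known to be broken
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (isBroken(items[mid])) {
      hi = mid;     // breakage is at mid or earlier
    } else {
      lo = mid + 1; // breakage is after mid
    }
  }
  return items[lo];
}

// Illustrative data: the version names are made up.
const builds = ["v1.0", "v1.1", "v1.2", "v1.3", "v1.4", "v1.5"];
const brokenSince = new Set(["v1.3", "v1.4", "v1.5"]);
const culprit = findFirstBroken(builds, (b) => brokenSince.has(b));
console.log(culprit);
```

Each test halves the remaining candidates, so six builds need at most three checks instead of up to six.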

Git Bisect (for regressions):

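git bisect start
git bisect bad HEAD
git bisect good <known-good-commit>
# Git checks out a commit midway between good and bad for testing
# After each test, mark the result:
git bisect good  # or: git bisect bad
# Repeat until Git reports the first bad commit, then return to the original HEAD:
git bisect reset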
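
Marking each commit by hand can be automated with `git bisect run`, which runs a command at every step and reads its exit code: 0 means good, 125 means the commit cannot be tested and should be skipped, and any other code from 1 to 127 means bad. The script below is a sketch under those conventions; the outcome names and `checkCommit` are hypothetical placeholders for a real reproduction.

```javascript
// Maps a check outcome to the exit codes `git bisect run` expects:
// 0 = good commit, 125 = cannot test this commit (skip), 1 = bad commit.
// The outcome strings are illustrative, not part of Git.
function bisectExitCode(outcome) {
  switch (outcome) {
    case "pass": return 0;
    case "untestable": return 125; // e.g. the commit does not build
    case "fail": return 1;
    default: throw new Error(`unknown outcome: ${outcome}`);
  }
}

// Hypothetical check: replace the body with the real reproduction.
function checkCommit() {
  const bugReproduces = false; // placeholder result
  return bugReproduces ? "fail" : "pass";
}

process.exitCode = bisectExitCode(checkCommit());
```

Saved under an assumed filename such as `bisect-check.js`, it would be used as `git bisect run node bisect-check.js` after the good and bad commits have been marked.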

Code Elimination:

Comment out sections to isolate the problem
Create a minimal reproduction case
Strip away everything non-essential

Environment Isolation:

Test in isolation (unit test the failing path)
Compare working vs broken environments
Use fresh installs to eliminate pollution

Output: "The bug is in [specific component/function/line range]"

Phase 3: DIAGNOSE (Understand Root Cause)

Goal: Know exactly WHY the bug occurs, not just WHERE.

Scientific Method:

Observe: What exactly is happening?
Hypothesize: Why might this be happening?
Predict: If hypothesis is correct, what else would be true?
Test: Verify predictions with evidence
Iterate: Refine hypothesis based on results

Logging Strategy:

// Add strategic logging at boundaries
console.log("[DEBUG] Function entry:", { input, state });
console.log("[DEBUG] After processing:", { result, sideEffects });
console.log("[DEBUG] Function exit:", { returnValue });
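
To avoid editing the function under investigation, the same boundary logging can be applied from the outside. This is a minimal sketch for synchronous functions only; `withBoundaryLogging` and its `logger` parameter are illustrative names, not an existing API.

```javascript
// Wraps a function so its inputs, outputs, and thrown errors are logged at
// the boundary. `withBoundaryLogging` and `logger` are illustrative names.
function withBoundaryLogging(name, fn, logger = console.log) {
  return function (...args) {
    logger(`[DEBUG] ${name} entry:`, { args });
    try {
      const returnValue = fn.apply(this, args);
      logger(`[DEBUG] ${name} exit:`, { returnValue });
      return returnValue;
    } catch (error) {
      logger(`[DEBUG] ${name} threw:`, { message: error.message });
      throw error; // rethrow so logging never hides the failure
    }
  };
}

const add = withBoundaryLogging("add", (a, b) => a + b);
add(2, 3);
```

Because the wrapper is applied at the call site, removing the instrumentation afterwards is a one-line change, which addresses the risk of leaving debug statements behind.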

Common Root Causes:

| Symptom                    | Likely Causes                                      |
| -------------------------- | -------------------------------------------------- |
| Works locally, fails in CI | Environment differences, timing, resources         |
| Intermittent failure       | Race condition, flaky network, resource contention |
| Works then stops working   | State mutation, memory leak, cache poisoning       |
| Wrong data                 | Type coercion, encoding, timezone, precision       |
| Silent failure             | Swallowed exception, async error, missing await    |
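
As an illustration of the "Wrong data" row, the sketch below shows two of the listed causes, type coercion and precision, in JavaScript. The variable names are made up for the example; logging the value together with its type is what exposes the problem.

```javascript
// Two common "wrong data" sources in JavaScript, shown with actual values.
const fromForm = "10";                 // hypothetical value that arrived as a string
const coerced = fromForm + 1;          // string concatenation, not addition
const intended = Number(fromForm) + 1; // explicit conversion before arithmetic

const sum = 0.1 + 0.2;                 // 0.1 and 0.2 have no exact binary representation
const closeEnough = Math.abs(sum - 0.3) < Number.EPSILON;

console.log({ coerced, type: typeof coerced, intended, sum, closeEnough });
```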

Output: Clear explanation of the root cause with evidence.

Phase 4: FIX & VERIFY (Resolve and Prevent)

Goal: Fix the issue and prevent regression.

Fix Process:

Write a failing test that captures the bug
Implement a minimal fix: change as little as possible
Verify the test passes, confirming the fix works
Check for similar patterns: does the same bug exist elsewhere?
Review the fix for side effects: does it break anything?
Document the fix: why the bug happened and how to prevent it

Verification Checklist:

[ ] The test that specifically catches this bug passes
[ ] Existing tests still pass
[ ] Manual verification confirms fix
[ ] Fix works in all affected environments
[ ] No new warnings or errors introduced

Prevention:

Add guards/validation at boundaries
Improve error messages for easier future debugging
Document gotchas for other developers
Consider whether an architectural change would prevent similar bugs
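
Guards at boundaries and better error messages often go together. The sketch below validates one input at the edge of a component and reports the field name, the actual value, and its type; `requirePositiveInteger`, `fetchPage`, and `pageSize` are hypothetical names chosen for the example.

```javascript
// Hypothetical boundary guard: validates input once, at the edge of a
// component, and fails with a message that names the field and the value.
function requirePositiveInteger(value, fieldName) {
  if (!Number.isInteger(value) || value <= 0) {
    throw new TypeError(
      `${fieldName} must be a positive integer, got ${JSON.stringify(value)} (${typeof value})`
    );
  }
  return value;
}

// Illustrative caller: `pageSize` is an assumed parameter name.
function fetchPage(pageSize) {
  const size = requirePositiveInteger(pageSize, "pageSize");
  return { size };
}

try {
  fetchPage("10"); // a string slips in where a number is expected
} catch (err) {
  console.log(err.message);
}
```

Failing at the boundary keeps the error close to its cause, so a future investigation can skip most of the isolation phase.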

Debugging Anti-Patterns

DO NOT:

Guess and hope (change things randomly)
Assume you know the problem without evidence
Trust comments/docs over actual code behavior
Debug production with print statements you'll forget to remove
Fix the symptom instead of the root cause
Make multiple changes at once

DO:

Verify assumptions with evidence
Change one thing at a time
Log actual values, not what you expect
Trust the code over documentation
Take breaks when stuck (fresh eyes help)

Quick Reference

1. REPRODUCE    → Can I reliably trigger this?
2. ISOLATE      → Where exactly is it failing?
3. DIAGNOSE     → Why is it failing?
4. FIX & VERIFY → How do I fix it permanently?

Output Template

## Bug Investigation: [Title]

### Reproduction

- Steps to reproduce
- Environment details
- Frequency

### Isolation

- Search method used
- Scope narrowed to

### Root Cause

- What's actually wrong
- Why it happens
- Evidence

### Fix

- Code changes made
- Test added

### Prevention

- How to prevent similar bugs
- Documentation updates
