Systematic Debugging
Four-phase scientific method for root-cause analysis. Integrates with slow-is-fast reasoning principles.
Iron Law: No fix without verified root cause. Never say "this should fix it."
Phase 1: Root Cause Investigation
1.1 Collect Symptoms
Gather all available evidence before forming any hypothesis:
- Read the full error message/stack trace
- Identify when it started (check
git log --oneline -20for recent changes) - Determine reproduction conditions (always? intermittent? environment-specific?)
- Check if there are related error reports or failing tests
1.2 Reproduce
A bug you can't reproduce is a bug you can't fix.
- Write down the exact reproduction steps
- Confirm you can trigger the failure consistently
- If intermittent: identify the variable (timing, data, concurrency, environment)
1.3 Narrow the Scope
Use binary search thinking:
- Bisect in time:
git log --oneline— when did it last work? - Bisect in space: Which module/layer? Add strategic logging or assertions
- Bisect in data: Which inputs trigger it? Minimal reproduction case
1.4 Verify, Don't Recall
Never state environment details from memory. Before diagnosing OS, compiler, SDK, runtime, or tool version issues, run the detection command first:
sw_vers # macOS version
xcodebuild -version # Xcode / SDK
node --version # Node
rustc --version # Rust
State the actual output. A diagnosis built on an assumed version is not a diagnosis.
External tool / MCP failure: diagnose before switching. When an MCP tool, CLI dependency, or external API is unavailable or returning errors, do not immediately try an alternative method. First determine why it failed:
- Is the server running?
- Is the API key valid or expired?
- Is the config pointing to the right endpoint?
- Is a proxy needed?
Switching to a workaround without diagnosing the original failure leaves the real bug intact and wastes the next session too.
Output after Phase 1:
Root cause hypothesis: <one sentence>
Evidence: <what supports this>
Confidence: <low/medium/high>
Phase 2: Pattern Analysis
Match against known bug patterns before investigating from scratch:
| Pattern | Signature | Where to Look | |---------|-----------|---------------| | Race condition | Intermittent, timing-dependent, "works on retry" | Shared state, async operations, missing locks/awaits | | Nil/null propagation | Crash on property access, "undefined is not a function" | Optional chaining gaps, missing null checks at boundaries | | State corruption | Wrong value at wrong time, stale data | Mutable shared state, cache invalidation, stale closures | | Integration failure | Works in isolation, fails in composition | API contract mismatch, version skew, env config | | Config drift | Works locally, fails in CI/staging/prod | Environment variables, feature flags, dependency versions | | Stale cache | Old behavior persists after fix, "but I already changed that" | Build cache, module cache, CDN, browser cache, ORM cache | | Test pollution | Test passes alone, fails in suite (or vice versa) | Shared global state, missing cleanup, execution order |
For Flaky Tests
Use the polluter-finding approach:
- Run the failing test in isolation — does it pass?
- If yes: another test is polluting shared state
- Binary search the test suite to find the polluter:
# Run first half of suite + failing test # If fails: polluter is in first half. Recurse. # If passes: polluter is in second half. Recurse. - Once found: fix the shared state leak, don't just reorder tests
Phase 3: Hypothesis Testing
Scientific Method
For each hypothesis:
- Predict: "If hypothesis X is correct, then Y should be true"
- Test: Design a minimal experiment that confirms or refutes
- Observe: Record actual result — no interpretation yet
- Conclude: Does evidence support or refute? Update hypothesis
Single variable: Change only one thing per test. Multiple changes = uninterpretable results.
3-Strike Rule
After 3 failed hypotheses:
- STOP. Do not try a 4th fix.
- Reassess: you likely have the wrong mental model of the system
- Ask: "What assumption am I making that could be wrong?"
- Consider: is this an architectural issue, not a bug?
- Present findings to user and ask for guidance
Same Symptom After Fix = Hard Stop
If the user reports the same symptom after a patch was applied, do not patch again. Treat it as a new investigation:
- The previous hypothesis was wrong — discard it completely
- Re-read the execution path from scratch; do not continue from the prior mental model
- Three rounds of "fixed but still broken" in the same area means the abstraction is wrong, not the specific line
Self-regulation triggers — stop and reassess if you catch yourself:
- Reverting the same area twice
- A "single bug" fix touching more than 3 files
- Each fix surfacing a new problem in a different module
- Writing the fix before finishing the trace
Rationalization Watch
These phrases are diagnostic failures in disguise. When one surfaces, stop and re-examine:
| What you're thinking | What it actually means | Required action | |---|---|---| | "I'll just try this one thing" | No hypothesis, random-walking | Stop. Write the hypothesis first. | | "I'm confident it's X" | Confidence is not evidence | Run an instrument that proves it. | | "Probably the same issue as before" | Treating a new symptom as a known pattern | Re-read the execution path from scratch. | | "It works on my machine" | Environment difference IS the bug | Enumerate every env diff before dismissing it. | | "One more restart should fix it" | Avoiding the error message | Read the last error verbatim. Never restart more than twice without new evidence. | | "Fix works but I don't understand why" | Coincidental success, root cause unknown | Revert the fix. Resume hypothesis testing. | | "It's a framework bug" | Likely your code, not the dependency | Reproduce in isolation against your own code first. |
Phase 4: Implementation
Fix Root Cause, Not Symptoms
- Wrong: Adding a null check where null shouldn't be possible
- Right: Fixing the code path that produces null
Minimal Diff
- Change only what's necessary for the fix
- No drive-by refactoring during bug fixes
- No "while I'm here" improvements
Regression Test
- Write a test that fails with the bug present
- Apply the fix
- Verify the test passes
- Run the full test suite — no regressions
Verification Report
After fix is applied, output:
## Debug Report
Symptom: <what was observed>
Root cause: <what actually caused it>
Evidence: <how we confirmed>
Fix: <what was changed and why>
Regression test: <test name/location>
Hypotheses tested: <N> (<list with outcomes>)
Confidence: <high — verified | medium — likely but edge cases possible>
Anti-Patterns
- Shotgun debugging: Changing random things hoping something works
- Fix-and-pray: Applying a fix without understanding the cause
- Symptom patching: Hiding the bug instead of fixing it (e.g., try/catch swallowing errors)
- Blame the framework: Assuming the bug is in a dependency before checking your own code
- Stale hypothesis: Continuing to pursue a theory after evidence contradicts it
- Over-logging: Adding 50 log statements instead of thinking about the problem
- Diagnosing from memory: Asserting OS / runtime / SDK version state without running the detection command first
- Workaround over diagnosis: Switching to an alternative tool or path when one fails, without finding out why the original failed