Root Cause Analysis
Delegated investigation: symptom → hypothesis → elimination → root cause → prevention.
Steps
- Load the outfitter:debugging skill for systematic investigation
- Apply elimination techniques from this skill's references
- Document investigation trail using RCA templates
- Deliver root cause report with prevention recommendations
<when_to_use>
- Diagnosing system failures or unexpected behavior
- Investigating incidents or outages
- Finding the actual cause vs surface symptoms
- Preventing recurrence through understanding
- Post-incident reviews requiring formal documentation
NOT for: known issues with documented fixes, simple configuration errors, or routine debugging (use the debugging skill directly)
</when_to_use>
<rca_focus>
This skill extends debugging with formal RCA practices:

| Aspect | Debugging | Root Cause Analysis |
|--------|-----------|---------------------|
| Scope | Fix the immediate issue | Understand why it happened |
| Output | Working code | RCA report + prevention |
| Documentation | Investigation notes | Formal templates |
| Goal | Resolution | Prevention of recurrence |

Use debugging for day-to-day bug fixes. Use find-root-causes for incidents requiring formal investigation and documentation.
</rca_focus>
<elimination_techniques>
Three core techniques for narrowing to root cause:

| Technique | When to Use | Method |
|-----------|-------------|--------|
| Binary Search | Large problem space, ordered changes | Bisect the change range |
| Variable Isolation | Multiple variables, need causation | Control all but one |
| Process of Elimination | Finite set of possible causes | Rule out systematically |

See elimination-techniques.md for detailed methods and examples.
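
As an illustration of the Binary Search row, here is a minimal Python sketch of bisecting an ordered change range. It is not taken from elimination-techniques.md. The function name `first_bad_change`, the `is_bad` predicate, and the build numbers are hypothetical, and the sketch assumes the failure persists once introduced (every change before the culprit is good, every change from it onward is bad).

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad_change(changes: Sequence[T], is_bad: Callable[[T], bool]) -> T:
    """Return the earliest change for which is_bad() is True.

    Assumes changes are ordered oldest to newest and that the failure
    persists once introduced, so good changes precede all bad ones.
    """
    if not changes:
        raise ValueError("changes must not be empty")
    lo, hi = 0, len(changes) - 1
    if not is_bad(changes[hi]):
        raise ValueError("newest change is not bad; nothing to bisect")
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(changes[mid]):
            hi = mid      # failure already present: culprit is mid or earlier
        else:
            lo = mid + 1  # still good here: culprit is later
    return changes[lo]


# Hypothetical data: builds 1..8, with the failure introduced in build 6.
builds = list(range(1, 9))
culprit = first_bad_change(builds, lambda build: build >= 6)
print(culprit)
```

Each probe halves the remaining range, so the number of tests grows with the logarithm of the range size rather than its length. If the failure is intermittent, the ordering assumption breaks and `is_bad` would need to repeat the test before answering.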
</elimination_techniques>
<documentation>
Investigation Trail
Log every step for handoff and pattern recognition:
[TIME] STAGE: Action → Result
[10:15] DISCOVERY: Gathered error logs → Found NullPointerException
[10:22] HYPOTHESIS: User object not initialized
[10:28] TEST: Added null check logging → Confirmed user is null
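
The log format above could be produced by a small helper. This is a sketch only: the `TrailEntry` class and its field names are hypothetical, and the entries simply reproduce the example lines above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrailEntry:
    """One step of the investigation trail (hypothetical helper)."""

    time: str
    stage: str
    action: str
    result: Optional[str] = None  # hypotheses may have no result yet

    def render(self) -> str:
        # Format: [TIME] STAGE: Action → Result
        line = f"[{self.time}] {self.stage.upper()}: {self.action}"
        if self.result is not None:
            line += f" → {self.result}"
        return line


trail = [
    TrailEntry("10:15", "discovery", "Gathered error logs", "Found NullPointerException"),
    TrailEntry("10:22", "hypothesis", "User object not initialized"),
    TrailEntry("10:28", "test", "Added null check logging", "Confirmed user is null"),
]
log = "\n".join(entry.render() for entry in trail)
print(log)
```

Keeping entries as structured records rather than free text makes it easier to filter by stage later, for example to list every hypothesis that was tested.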
RCA Report Structure
- Summary — one-sentence root cause
- Timeline — events leading to incident
- Impact — what was affected, duration
- Root Cause — why it happened (not just what)
- Contributing Factors — conditions that enabled it
- Prevention — changes to prevent recurrence
- Detection — how to catch it earlier next time
See documentation-templates.md for full templates.
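
The seven sections listed above can be modeled as a record that flags incomplete reports before they are delivered. This is a minimal sketch, not the template from documentation-templates.md. The `RcaReport` class, its methods, and all sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RcaReport:
    """Hypothetical container mirroring the seven report sections."""

    summary: str
    timeline: List[str]
    impact: str
    root_cause: str
    contributing_factors: List[str]
    prevention: List[str]
    detection: List[str]

    def missing_sections(self) -> List[str]:
        # A section counts as missing when it is an empty string or empty list.
        return [name for name, value in vars(self).items() if not value]

    def render(self) -> str:
        lines: List[str] = []
        for name, value in vars(self).items():
            lines.append(name.replace("_", " ").title())
            if isinstance(value, list):
                lines.extend(f"- {item}" for item in value)
            else:
                lines.append(value)
        return "\n".join(lines)


# Hypothetical sample values; impact is left empty on purpose.
report = RcaReport(
    summary="User object was null when the request handler ran",
    timeline=["10:15 error logs gathered", "10:28 null user confirmed"],
    impact="",
    root_cause="Handler ran before user initialization completed",
    contributing_factors=["No null check at the handler boundary"],
    prevention=["Initialize the user before dispatching requests"],
    detection=["Alert on NullPointerException rate"],
)
print(report.missing_sections())
print(report.render())
```

A check like `missing_sections()` gives a cheap gate for the rules on prevention and contributing factors: a report with any empty section is not ready to deliver.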
</documentation>
<common_pitfalls>

| Trap | Counter |
|------|---------|
| "I already looked at that" | Re-examine with fresh evidence |
| "That can't be the issue" | Test anyway, let evidence decide |
| "We need to fix this quickly" | Methodical investigation is faster |
| Confirmation bias | Actively seek disconfirming evidence |
| Correlation = causation | Test direct causal mechanism |

See pitfalls.md for detailed resistance patterns and recovery.
</common_pitfalls>
<rules>
ALWAYS:
- Load debugging skill for systematic investigation methodology
- Use elimination techniques to narrow root cause
- Document investigation trail as you go
- Produce formal RCA report for incidents
- Include prevention recommendations
- Identify contributing factors, not just root cause
NEVER:
- Skip formal documentation for incidents
- Stop at "what happened" without "why"
- Propose fixes without understanding root cause
- Omit prevention recommendations
- Blame individuals (focus on systems)
</rules>
- elimination-techniques.md — binary search, variable isolation, process of elimination
- pitfalls.md — cognitive biases and resistance patterns
- documentation-templates.md — investigation logs and RCA reports