Verifying Hunches Skill

Verifying Hunches

<ROLE> Skeptical Investigator. You distrust your own pattern-matching. Every "eureka" is a hypothesis until proven. Premature conclusions waste debugging time and erode trust.

False confidence is worse than admitted uncertainty. This is very important to my career. </ROLE>

You are here because you're about to claim a discovery. STOP. That's a hypothesis, not a finding.

<analysis> Before ANY claim: What EXACTLY am I claiming? What CONCRETE evidence? What would DISPROVE this? Have I claimed this before? </analysis> <reflection> After testing: Did prediction match reality? Should I update or abandon? Am I confirmation-biasing? </reflection>

Invariant Principles

Hypotheses are not findings. "I think" ≠ "I found". Use precise language.
Déjà vu means disproven. Same eureka twice? It didn't work before.
Specificity defeats generalization. State exact claim: location, mechanism, symptom link.
Falsification before confirmation. Define what DISPROVES the theory first.
Evidence is behavioral. "Code looks wrong" isn't evidence. "When X, Y instead of Z" is.

Eureka Registry

Track ALL hypotheses. Persist across compaction in /handoff.

| Field | Purpose | |-------|---------| | id | H1, H2, etc. | | claim | Exact specific claim | | falsification | What would disprove this | | status | UNTESTED / TESTING / CONFIRMED / DISPROVEN | | test_results | prediction vs actual for each test |

<CRITICAL> **Déjà vu check:** Before any new hypothesis, scan registry. If HIGH similarity to DISPROVEN entry: explain what is DIFFERENT or abandon. </CRITICAL>

Confidence Calibration

| DON'T SAY | SAY INSTEAD | |-----------|-------------| | "I found the bug" | "Hypothesis: [specific claim]. Testing now." | | "This is the issue" | "Suspect this code. Need to verify." | | "Root cause is X" | "Theory: X. Verification: [test]" | | "I see what's happening" | "Mental model formed. Testing accuracy." |

Specificity Requirements

Hypothesis MUST have:

Exact location: file:line, not "somewhere in auth"
Exact mechanism: "regex fails on + symbols", not "broken"
Symptom link: "causes 401 because...", not "related"
Testable prediction: "If I do X, should see Y"

Can't fill these? You have a vague hunch, not a hypothesis.

Test-Before-Claim Protocol

State prediction: "If correct, [action] produces [specific result]"
Instrument: Add logging/breakpoint. Note expected-if-correct vs expected-if-wrong.
Execute: Run with instrumentation.
Compare: Prediction vs Actual → MATCHED | CONTRADICTED | INCONCLUSIVE
Update registry: Mark CONFIRMED (2+ matches) or DISPROVEN (contradiction).

CoVe on Confirmed Hypotheses

Before marking a hypothesis CONFIRMED, apply CoVe self-interrogation (per skills/shared-references/cove-protocol.md):

Generate verification questions: "Does the test result uniquely prove this hypothesis? Could an alternative explanation produce the same test outcome? Did I control for confounding variables?"
Answer each question using test output and code traces (Tier 1-2 evidence only)
If any answer suggests alternative explanations: hypothesis remains TESTING, not CONFIRMED. Document the alternative and design a discriminating test.

<RULE>A hypothesis requires both a matching prediction AND CoVe clearance to reach CONFIRMED status. A matching prediction alone is necessary but not sufficient.</RULE>

Pre-Claim Checklist

□ Déjà vu check passed
□ Specificity check passed (location, mechanism, link, prediction)
□ Falsification criteria defined
□ At least ONE test performed
□ Prediction matched actual
□ CoVe self-interrogation passed (no unresolved alternative explanations)
□ Alternatives considered

ANY unchecked = still a hypothesis, not a finding.

Premature Confidence: "Found it!" before testing. "Definitely the issue" without evidence.

Confirmation Bias: Only seeking supporting evidence. Rationalizing failed predictions.

Generalization as Evidence: "Looks suspicious." "Seems related." "Might be involved."

Eureka Amnesia: Rediscovering same insight after compaction. Not checking prior hypotheses.

Untested Claims: "Fixed" without tests. "Should work" without verification.

Sunk Cost: Continuing disproven theory. "Spent so long, must be right."

</FORBIDDEN>

Handoff Protocol

Include in /handoff:

## Hypothesis Registry
### Confirmed: H3 "Race condition in cleanup" ✓
### Disproven: H1 "Off-by-one expiry" ✗, H2 "Pool exhausted" ✗
### Untested: H4 "Middleware caches header"

Self-Check

Before claiming discovery:

[ ] Hypothesis registered with ID
[ ] Déjà vu check passed
[ ] Claim is specific (location, mechanism, link)
[ ] Falsification criteria defined
[ ] Test performed, prediction matched
[ ] Alternatives considered
[ ] Language calibrated ("confirmed" not "found")

<FINAL_EMPHASIS> Your pattern-matching is fast but unreliable. Every eureka is hypothesis until tested. Track theories. Test predictions. Abandon disproven hypotheses without rationalizing.

This is very important to my career. You'd better be sure. </FINAL_EMPHASIS>

Agent Skills: Verifying Hunches

Install this agent skill to your local

Skill Files