bug-hunt Skill | Agent Skills

bug-hunt

A correctness sweep that only reports bugs it failed to refute. Finders generate candidates; verifiers try to kill them; survivors make the report. The single biggest failure mode of agent bug-hunting is plausible-but-wrong findings, so verification is not optional.

Read-only. Finding bugs and fixing them are separate engagements.

Lenses

Sweep with each lens. With parallel subagents available, one finder per lens; otherwise sequential passes.

| Lens | Hunting for | |------|-------------| | Logic | Inverted conditions, off-by-one, wrong operator, unreachable branches, broken invariants | | Error handling | Swallowed exceptions, missing error paths, errors that corrupt state before propagating, misleading messages | | Edge cases | Empty/nil/zero inputs, unicode, huge inputs, boundary values, first/last iteration | | Concurrency | Races, missing locks, shared mutable state, TOCTOU, async ordering assumptions | | API misuse | Contract violations against libraries and the project's own interfaces, ignored return values, resource leaks, lifecycle errors |

Focus finders on code that is reachable and load-bearing: entry points, hot paths, recently changed files (git log --since is a good prior). A bug in dead code is info, not a finding.

Verification (mandatory)

Every candidate gets an adversarial pass before it may appear in the report. The verifier's job is to REFUTE the finding, default skeptical:

Read the actual code path end to end, including callers and guards the finder may have missed.
Trace a concrete input that triggers the bug. No trigger, no bug.
Check whether a test, type system, or runtime check already prevents it.
Verdict: confirmed (with the triggering scenario), refuted (drop silently), or unverifiable (report downgraded one severity, marked (unverified)).

When tests can be run safely (no external dependencies, sandboxed), a failing reproduction test is the gold standard for confirmation and should be included in the finding as a sketch, not committed.

Report contract

Same spine as line-check so findings compose. Severity: critical (data loss, corruption, security-adjacent) / high (wrong results on common inputs, crashes) / medium (wrong on edge cases) / low (latent, needs unlikely conditions) / info. Effort is the fix cost: S / M / L.

# bug-hunt report: <repo> (<date>)

## Verdict
Paragraph: overall correctness posture, the scariest confirmed bug.

## Scorecard
| Lens | Score (0-5) | Summary |

## Findings
### [SEVERITY] Short imperative title
- **Lens:** which lens found it
- **Where:** file:line
- **What:** the defect, concretely
- **Trigger:** the concrete input or sequence that hits it
- **Why it matters:** consequence
- **Fix:** specific action
- **Effort:** S / M / L

## Backlog
Numbered, leverage-sorted: `N. [SEVERITY/EFFORT] title (lens)`

## Not checked
Lenses or areas skipped and why; candidates that were refuted (count only).

Common mistakes

Reporting finder output without verification. Half of plausible candidates die under a skeptical read.
"This could be a problem if..." findings with no trigger. A bug without a triggering input is a hypothesis.
Treating style issues as bugs. Wrong formatting never corrupted data.
Stopping at the first confirmed bug in a file. Bugs cluster; finish the file.

Agent Skills: bug-hunt

Install this agent skill to your local

Skill Files

bug-hunt

Lenses

Verification (mandatory)

Report contract

Common mistakes