Agent Skills: Pentest Exploit Validation

Proof-driven exploitation with a 4-level evidence system, a bypass exhaustion protocol, mandatory evidence checklists, and strict EXPLOITED/POTENTIAL/FALSE_POSITIVE classification.

ID: plurigrid/asi/pentest-exploit-validation

Install this agent skill locally:

pnpm dlx add-skill https://github.com/plurigrid/asi/tree/HEAD/plugins/asi/skills/pentest-exploit-validation

plugins/asi/skills/pentest-exploit-validation/SKILL.md

Pentest Exploit Validation

Purpose

Validate vulnerability findings through proof-driven exploitation using Shannon's 4-level evidence system. Consumes the exploitation queue from white-box code review, attempts structured exploitation with bypass exhaustion, collects mandatory evidence per vulnerability type, and classifies each finding as EXPLOITED, POTENTIAL, or FALSE_POSITIVE.

Prerequisites

Authorization Requirements

  • Written authorization with explicit scope for active exploitation testing
  • Exploitation queue JSON from pentest-whitebox-code-review output
  • Test accounts at multiple privilege levels for authz testing
  • Data exfiltration approval — confirm acceptable proof-of-concept scope
  • Rollback plan for any data-mutating exploits

Environment Setup

  • sqlmap for automated SQL injection exploitation
  • Burp Suite Professional with Repeater, Intruder, and Turbo Intruder
  • curl for manual HTTP request crafting
  • Playwright for browser-based exploitation (XSS, CSRF)
  • nuclei with custom templates for automated validation
  • Isolated testing environment or explicit production testing approval

Core Workflow

  1. Queue Intake: Parse exploitation queue JSON, validate schema, prioritize by confidence score and impact severity. Group findings by vulnerability type for parallel exploitation.
  2. Injection Exploitation: Confirm injectable parameter → fingerprint backend (DB type, OS) → enumerate databases/tables → demonstrate data exfiltration with minimal footprint.
  3. XSS Exploitation: Graph traversal from source → processing → sanitization → sink. Craft context-appropriate payload, demonstrate session hijack or DOM manipulation.
  4. Auth Exploitation: Attack authentication weaknesses → demonstrate account takeover via credential stuffing, token forgery, or session hijack.
  5. Authz Exploitation: Horizontal access (cross-user data) → vertical escalation (admin functions) → workflow bypass (state manipulation).
  6. SSRF Exploitation: Internal service access → cloud metadata retrieval (169.254.169.254) → internal network reconnaissance.
  7. Bypass Exhaustion: For each finding, attempt 3 initial payloads → if blocked, escalate to 8-10 bypass variations → if still blocked, deploy automated tool variants.
  8. Impact Escalation: Escalate from proof-of-concept to a real impact demonstration (data exfiltration, session hijacking, or remote code execution), staying within the approved proof-of-concept scope.
  9. Evidence Collection: Collect mandatory evidence per vulnerability type using per-type checklists.
  10. Classification: Assign final classification — EXPLOITED, POTENTIAL, or FALSE_POSITIVE — based on 4-level proof system.
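The queue intake step (step 1) can be sketched as below. This is a minimal illustration, not the skill's actual implementation: the field names (`id`, `vuln_type`, `confidence`, `severity`), the severity labels, and the choice to rank severity ahead of confidence are all assumptions, since the real queue schema is defined by the pentest-whitebox-code-review output.

```python
import json
from collections import defaultdict

# Hypothetical schema: the real queue format comes from pentest-whitebox-code-review.
REQUIRED_FIELDS = {"id", "vuln_type", "confidence", "severity"}
SEVERITY_RANK = {"critical": 4, "high": 3, "medium": 2, "low": 1}


def load_queue(raw_json):
    """Parse the queue and reject entries that fail the (assumed) schema."""
    findings = json.loads(raw_json)
    if not isinstance(findings, list):
        raise ValueError("queue must be a JSON array of findings")
    for finding in findings:
        if not isinstance(finding, dict):
            raise ValueError("each finding must be a JSON object")
        missing = REQUIRED_FIELDS - finding.keys()
        if missing:
            raise ValueError(f"finding {finding.get('id', '?')} is missing {sorted(missing)}")
        if finding["severity"] not in SEVERITY_RANK:
            raise ValueError(f"finding {finding['id']} has unknown severity {finding['severity']!r}")
    return findings


def prioritize(findings):
    """Order findings by severity first, then confidence, highest first (assumed policy)."""
    return sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f["severity"]], f["confidence"]),
        reverse=True,
    )


def group_by_type(findings):
    """Group findings by vulnerability type, preserving the incoming order."""
    groups = defaultdict(list)
    for finding in findings:
        groups[finding["vuln_type"]].append(finding)
    return dict(groups)
```

Calling `group_by_type(prioritize(load_queue(raw)))` yields one prioritized work list per vulnerability type, which is one way to feed the per-type exploitation steps in parallel.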

4-Level Proof System

| Level | Description | Classification |
|-------|-------------|----------------|
| L1 | Weakness identified in code but not confirmed exploitable | POTENTIAL |
| L2 | Partial bypass achieved but full exploitation not demonstrated | POTENTIAL |
| L3 | Vulnerability confirmed with reproducible evidence | EXPLOITED |
| L4 | Critical impact demonstrated (data exfil, RCE, account takeover) | EXPLOITED CRITICAL |
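The level-to-classification mapping in this table can be expressed as a small lookup. The sketch below is illustrative only: the `ProofLevel` enum and its member names are hypothetical, and FALSE_POSITIVE is deliberately absent because the table assigns it no proof level.

```python
from enum import IntEnum


class ProofLevel(IntEnum):
    """Hypothetical names for the four proof levels in the table."""
    WEAKNESS_IDENTIFIED = 1  # L1: seen in code, not confirmed exploitable
    PARTIAL_BYPASS = 2       # L2: partial bypass, no full exploitation
    CONFIRMED = 3            # L3: confirmed with reproducible evidence
    CRITICAL_IMPACT = 4      # L4: critical impact demonstrated


def classification_for(level):
    """Return the classification label the table assigns to a proof level."""
    level = ProofLevel(level)  # raises ValueError for anything outside 1-4
    if level is ProofLevel.CRITICAL_IMPACT:
        return "EXPLOITED CRITICAL"
    if level is ProofLevel.CONFIRMED:
        return "EXPLOITED"
    return "POTENTIAL"
```

Using an ordered enum keeps comparisons such as `level >= ProofLevel.CONFIRMED` available for report filtering.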

Classification Criteria

| Classification | Criteria |
|----------------|----------|
| EXPLOITED | Reproducible proof with evidence: HTTP request/response, extracted data, or demonstrated impact |
| POTENTIAL | Code-level weakness confirmed but exploitation blocked by defense-in-depth or environment constraints |
| FALSE_POSITIVE | Taint analysis flagged but manual review confirms effective sanitization or unreachable code path |
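One possible reading of these criteria as a decision function is sketched below. The `ReviewResult` record, its field names, and the order in which the checks run are assumptions made for illustration; the authoritative classification tree is the one in references/workflows.md.

```python
from dataclasses import dataclass


@dataclass
class ReviewResult:
    """Hypothetical summary of what testing and manual review established."""
    reproducible_proof: bool        # request/response, extracted data, or demonstrated impact captured
    weakness_confirmed: bool        # code-level weakness confirmed
    mitigated_or_unreachable: bool  # effective sanitization or unreachable code path confirmed


def classify_finding(result):
    """Map a review result to one of the three classifications."""
    # Assumed precedence: captured proof outranks every other signal.
    if result.reproducible_proof:
        return "EXPLOITED"
    if result.mitigated_or_unreachable:
        return "FALSE_POSITIVE"
    if result.weakness_confirmed:
        return "POTENTIAL"
    # None of the table's criteria are met, so the finding needs more review.
    raise ValueError("finding matches no classification criteria yet")
```

Raising on the unmatched case, rather than defaulting to a label, keeps a finding from being closed out before any of the criteria has actually been established.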

Tool Categories

| Category | Tools | Purpose |
|----------|-------|---------|
| SQL Injection | sqlmap, manual payloads | Automated and manual SQLi exploitation |
| Request Crafting | Burp Repeater, curl | Manual HTTP request manipulation |
| Fuzzing | Burp Intruder, Turbo Intruder | Payload variation and bypass testing |
| Browser Exploitation | Playwright | XSS demonstration, session hijack |
| Automation | nuclei, custom scripts | Template-based vulnerability validation |
| Evidence Capture | Burp Logger, screenshot tools | Request/response logging and proof |

References

  • references/tools.md - Tool function signatures and parameters
  • references/workflows.md - Exploitation workflows, evidence checklists, and classification tree