# Skill Creator (with Superpowers Enforcement)
This skill wraps the built-in skill-creator:skill-creator with enforcement pattern awareness from the superpowers framework. It adds an enforcement audit layer to the skill-creator's draft-test-iterate loop.
## When This Skill Applies
All skill creation and improvement work. This skill loads instead of the built-in skill-creator because it adds enforcement awareness that the built-in version lacks.
## Process
### Step 1: Classify the Skill
Before drafting, classify the skill being created:
| Type | Description | Enforcement Needs |
|------|-------------|-------------------|
| Workflow skill | Multi-phase process (like /dev, /ds, /writing) | High — needs Iron Laws, gates, rationalization tables |
| Tool skill | Wraps a tool or API (like readwise, wrds, bluebook) | Medium — needs Red Flags for common misuse |
| Knowledge skill | Domain knowledge reference (like ai-anti-patterns) | Low — needs trigger-only descriptions |
This classification determines how much enforcement audit to apply after each draft.
### Step 2: Invoke the Built-in Skill Creator
Use the Skill tool to invoke the built-in skill-creator:
```
Skill(skill="skill-creator:skill-creator")
```
Follow its full process: capture intent, interview, draft SKILL.md, write test cases, run evals, iterate. The built-in skill-creator handles the eval loop — do not reimplement it.
### Step 3: Enforcement Audit (After Each Draft)
After writing or revising the skill draft (and before running test cases), audit it against the superpowers enforcement patterns. Read the enforcement checklist:
```
Read("../../lib/references/enforcement-checklist.md")  # relative to this skill's base directory
```
Then score the draft using the process below.
#### For Workflow Skills (High Enforcement)
Score against all 12 patterns. Use the scoring template from the checklist. Focus on:
- Iron Laws — Does the skill have absolute constraints for high-drift actions? Are they wrapped in `<EXTREMELY-IMPORTANT>` tags with strong framing? If they use soft language ("try to", "should", "consider"), they will be ignored — rewrite with action-masking language.
- Rationalization Tables — Does the skill preempt the agent's excuses? The table must contain actual excuses the agent generates, not hypothetical ones. Observe failure modes in test runs, then add entries.
- Red Flags + STOP — Are there pattern interrupts for observable wrong actions? Must target actions ("About to X"), not intentions ("Thinking about X").
- Gate Functions — Does every phase transition have a verifiable exit condition? "Quality is sufficient" is not a gate. "File X contains string Y" is a gate.
- Trigger-Only Descriptions — Does the description contain ONLY trigger phrases? If it contains a process summary, the agent will follow the short description instead of reading the body. This is the single most common skill design mistake.
- Drive-Aligned Framing — Do verification steps use helpfulness-first framing? "Skipping X is NOT HELPFUL — [concrete user harm]" is stronger than "incorrect" or "premature" because it targets the model's strongest drive.
- Skill Dependencies — Does each phase explicitly read and invoke the next phase? Without explicit chaining, the agent will stop and wait.
- No Pause Between Tasks — Does the skill prevent "should I continue?" between tasks?
- Delete & Restart — For protocol violations, does the skill mandate deletion of contaminated work?
- Staged Review Loops — Do implementation sections have review loops with iteration limits?
- Flowcharts as Spec — For complex processes, is there an ASCII diagram that serves as the authoritative definition?
Critical gaps = High-drift action + Absent/Weak enforcement. Fix these before running evals.
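The Gate Function criterion above can be made concrete: a verifiable exit condition reduces to a deterministic check rather than a judgment call. A minimal sketch (the file path and marker string below are hypothetical, not part of any real skill):

```python
from pathlib import Path


def gate_passed(path: str, marker: str) -> bool:
    """Deterministic gate: pass only if the file exists and contains the marker string."""
    p = Path(path)
    return p.exists() and marker in p.read_text()


# A phase transition then reads: proceed only when, e.g.,
# gate_passed("evals/results.md", "ALL TESTS PASS") is True.
# "Quality is sufficient" admits rationalization; this check does not.
```

The point is not that skills must ship Python — it is that every gate should be expressible as a check this mechanical.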
#### For Tool Skills (Medium Enforcement)
Score against patterns 2, 3, 5, and 10:
- Rationalization Tables — What are common misuse patterns? (e.g., using the wrong API endpoint, skipping authentication)
- Red Flags + STOP — What wrong actions can the agent take? (e.g., calling a destructive API without confirmation)
- Trigger-Only Descriptions — Keep description to triggers only
- Staged Review Loops — For multi-step tool interactions, add review after each step
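As an illustration, a Red Flags section for a hypothetical tool skill might read as follows (the endpoint names and failure modes are invented for the example, not drawn from a real skill):

```markdown
## Red Flags — STOP

- About to call `DELETE /documents/{id}` without showing the user which
  document will be removed → STOP. Confirm the target first.
- About to retry a failed request against a different endpoint → STOP.
  A 401 means missing credentials, not a wrong endpoint; re-read the
  authentication section.
```

Note that both entries target observable actions ("About to X"), matching the Red Flags criterion above.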
#### For Knowledge Skills (Low Enforcement)
Score against pattern 5 only:
- Trigger-Only Descriptions — This is the most important pattern for knowledge skills. If the description summarizes the knowledge, the agent reads the summary instead of the full body.
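To illustrate the difference (the skill content and phrasing here are hypothetical), compare a process-summary description with a trigger-only one in the SKILL.md frontmatter:

```yaml
# Weak — summarizes the knowledge, so the agent answers from the
# summary and never reads the body:
description: Lists common AI anti-patterns such as overfitting and prompt injection.

# Strong — triggers only, forcing a full read when the topic comes up:
description: Use when reviewing ML code, auditing prompts, or when the user mentions anti-patterns, overfitting, or prompt injection.
```

The strong version tells the agent when to load the skill, and nothing else.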
### Step 4: Reconcile Tensions
The built-in skill-creator's writing advice and the superpowers enforcement patterns are in genuine tension:
| skill-creator says | superpowers says | Resolution |
|---|---|---|
| "Explain the why, avoid heavy-handed MUSTs" | "Iron Laws use strongest framing available" | Both are right for different contexts. Use "explain the why" for standalone instructions. Use Iron Laws for high-drift actions where the agent will rationalize shortcuts. |
| "Keep the prompt lean" | "Add Rationalization Tables, Red Flags" | Enforcement patterns go in the skill body, not the description. Progressive disclosure keeps it lean — move detailed tables to references/ if SKILL.md exceeds 500 lines. |
| "Generalize from feedback, don't overfit" | "Observe failure modes, add entries to tables" | Rationalization Tables ARE generalization. Each entry captures a class of failures, not a specific test case. |
When the built-in skill-creator suggests removing enforcement patterns because they're "not pulling their weight" or are "oppressively constrictive MUSTs," push back if the pattern addresses a real observed failure mode. The test: did an agent actually take the shortcut this pattern prevents? If yes, keep it.
### Step 5: Continue the Eval Loop
Return to the built-in skill-creator's process for running test cases, grading, and iterating. After each iteration's skill revision, re-run the enforcement audit (Step 3) on the updated draft.
During the eval loop, also look for enforcement-specific signals:
- Agent skipped a step → needs an Iron Law or Gate Function
- Agent rationalized a shortcut → capture the exact excuse in a Rationalization Table
- Agent went down a wrong path → add a Red Flag + STOP
- Agent claimed completion without evidence → add Drive-Aligned Framing
- Agent stopped between tasks → add No Pause Between Tasks
These signals come from reading test run transcripts, not just final outputs.
## References
- Enforcement checklist: `../../lib/references/enforcement-checklist.md` — Full 12-pattern reference with templates
- Philosophy: `../../PHILOSOPHY.md` — Three pillars (phased decomposition, deterministic gates, adversarial review)
- Built-in skill-creator: Handles the eval loop (draft → test → grade → iterate → description optimization)