Agent Skills: Agent Skill Creator Standard

Standards for creating, testing, and optimizing Agent Skills for any AI Agent (Claude, Cursor, Windsurf, Copilot). Use when: writing SKILL.md, auditing a skill, improving trigger accuracy, checking size limits, structuring references/, writing anti-patterns, starting a new skill from scratch, or reviewing skill quality.

UncategorizedID: hoangnguyen0403/agent-skills-standard/common-skill-creator

Install this agent skill to your local

pnpm dlx add-skill https://github.com/HoangNguyen0403/agent-skills-standard/tree/HEAD/.codex/skills/common/common-skill-creator

Skill Files

Browse the full folder contents for common-skill-creator.

Download Skill

Loading file tree…

.codex/skills/common/common-skill-creator/SKILL.md

Skill Metadata

Name
common-skill-creator
Description
'Standardizes the creation and evaluation of high-density Agent Skills (Claude, Cursor, Windsurf). Ensures skills achieve high Activation (specificity/completeness) and Implementation (conciseness/actionability) scores. Use when: writing or auditing SKILL.md, improving trigger accuracy, or refactoring skills to reduce redundancy and maximize token ROI.'

Agent Skill Creator Standard

Priority: P0 — Apply to ALL skills

Maximize Token ROI. Every line in SKILL.md must provide specific procedural value. Activation (how it triggers) and Implementation (how it helps) primary quality metrics.

Three-Level Loading System

  • Level 1 Frontmatter: name + description (Activation Anchor), ≤100 words.
  • Level 2 SKILL.md body: Core Rules + Workflows (Implementation Core), ≤100 lines.
  • Level 3 references/: Detailed examples, schemas, and "TESTS.md" (On-demand).

Workflow (New or Existing Skill)

New skill:

  1. Research — web-search domain best practices, checklists, and standards; extract key terms → triggers, workflows → guidelines, mistakes → anti-patterns. See Web Search Research.
  2. Capture intent — what it , when it trigger, expected output format?
  3. Write SKILL.md — draft using TEMPLATE.md
  4. Test — spawn parallel subagents: one with-skill, one without-skill (baseline)
  5. Evaluate — grade assertions, review benchmark (pass rate, tokens, time)
  6. Iterate — rewrite based on feedback, rerun into next iteration dir, repeat
  7. Optimize description — run trigger eval queries, target ≥80% accuracy
  8. Pressure-test — for discipline skills, capture agent excuses, red flags, and stop conditions Existing skill:
  9. Audit — run Quality Checklist below; identify violations
  10. Snapshotcp -r <skill-dir> <workspace>/skill-snapshot/ before any edits
  11. Improve SKILL.md — fix violations, compress, move oversized content to references/
  12. Test — spawn parallel subagents: one with-new-skill, one with-snapshot (baseline)
  13. Evaluate & iterate — same as steps 4–5 above
  14. Optimize description — re-run trigger eval if description changed
  15. Harden — add rationalization counters where agents still fail under pressure See Eval Workflow for full testing + iteration details.

Description Quality (Activation)

  • Third-Person Voice: Use Standardizes..., Audits..., Encrypts.... Avoid "I will" or "This skill helps to".
  • What + When Structure:
  • What: Define 5–8 specific capabilities (e.g., "Generates JWT tokens, rotates keys").
  • When: Explicitly define triggers (e.g., "Use when user says 'rotate keys'").
  • Specificity: Avoid vague verbs like "manage" or "handle". Use "Validate", "Inject", "Refactor", "Sanitize".
  • Trigger Hint: Include (triggers: *.ext, keyword) suffix for technical skills.

Content Quality (Implementation)

  • No Redundant Knowledge: NOT explain concepts AI already knows (e.g., HTTP status codes, standard library docs, basic SOLID principles). Focus strictly on project-specific rules.
  • Caveman Compression: Use "Caveman Mode" for rules to save tokens. Drop articles (, , ), remove filler words ("should", "will", "), and use telegraphic snippets.
  • Standard: "You should ensure that database connection closed after every query to prevent leaks." (15 tokens)
  • Caveman: "Close DB connection after query. Prevent leaks." (7 tokens)
  • Actionability: Examples must copy-paste ready and executable.
  • Workflow Clarity: Use sequential ordered lists for multi-step processes.
  • Progressive Disclosure: Move code blocks >10 lines to references/.
  • Pressure Hardening: Discipline skills must name red flags, common excuses, and exact stop/restart conditions.

Behavior Guardrails

  • Use pressure tests for discipline skills: TDD, debugging, verification, review, protocol, and workflow skills need baseline failure evidence.
  • Capture rationalizations: Save the exact excuses agents use when they skip the rule.
  • Add red flags: Short phrases that tell the agent to stop and restart the protocol.
  • Encode behavior in evals: Add pressure_scenarios, rationalizations, red_flags, and behavior_assertions when the skill is guardrail-oriented.
  • Keep it local: Put behavior details in evals or references/; keep SKILL.md compact.

Anti-Patterns

  • No "AI-splaining": not explain why pattern good unless it's unique project constraint.
  • No Vague Triggers: Never use src/** or **/*. surgical.
  • No Description Bloat: If description exceeds 100 words, some capabilities belong in body.
  • No long code blocks: >10 lines → extract to references/
  • No redundancy: don't repeat frontmatter content in body
  • No untested guardrails: Rules that were never pressure-tested are speculation.

Quality Checklist (Tessl-Aligned)

  • [ ] Activation ≥ 90%: Description covers both capabilities ("What") and triggers ("When").
  • [ ] Implementation ≥ 90%: No general-purpose explanations; all examples executable.
  • [ ] Structural Compliance: SKILL.md ≤ 100 lines; code blocks moved to references/.
  • [ ] Trigger rate ≥80% on should-trigger queries.
  • [ ] Guardrail skills include rationalizations, red flags, and behavior eval fields.

References

Agent Skill Creator Standard Skill | Agent Skills