Prompt Engineering Standards
<metadata>
- Load if: Writing AI prompts, optimizing context usage
- Prerequisites: @smith-principles/SKILL.md
</metadata>
CRITICAL: Prompt Caching (Primacy Zone)
<required>
Caching can cut costs by up to 90% and latency by up to 85% on long prompts.
Structure for caching:
- Static content first (methodology, rules)
- Tool definitions in consistent order
- Project context (AGENTS.md, docs)
- Dynamic content last (recent changes)
Cache breakpoints: Aim for ~1024-token boundaries (the minimum cacheable prefix length varies by provider and model). The prefix must be identical for a cache hit.
</required>
<forbidden>
- Reordering tools between calls
- Injecting dynamic content into static sections
- Modifying the cached prefix unnecessarily
- Using Markdown tables (see @smith-skills/SKILL.md; use bullet lists instead)
</forbidden>
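The ordering rules above can be sketched in a few lines of Python. This is a minimal illustration, not a provider integration: `build_prompt` and `prefix_fingerprint` are hypothetical helper names, and the tool definitions are assumed to be plain dictionaries with a `name` key.

```python
import hashlib
import json


def build_prompt(static_rules, tools, project_context, dynamic_content):
    """Return (cacheable_prefix, full_prompt) with static sections first."""
    # Sort tools by name and serialize with sorted keys so the tool block
    # is identical on every call, regardless of the input order.
    ordered_tools = sorted(tools, key=lambda tool: tool["name"])
    tool_block = json.dumps(ordered_tools, sort_keys=True)
    prefix = "\n\n".join([static_rules, tool_block, project_context])
    # Dynamic content is appended last so it never disturbs the prefix.
    return prefix, prefix + "\n\n" + dynamic_content


def prefix_fingerprint(prefix):
    """Hash the prefix so two calls can be compared cheaply."""
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()
```

Comparing fingerprints between consecutive calls is one cheap way to notice that something dynamic has leaked into the static section.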
AGENTS.md Cache-Friendly Structure
<!-- STATIC - cached -->
<metadata>
Scope, Load if, Prerequisites
</metadata>
<required>
Critical NEVER/ALWAYS rules
</required>
<forbidden>
Anti-patterns
</forbidden>
<!-- CACHE BREAKPOINT (~1024 tokens) -->
<!-- DYNAMIC - not cached -->
<examples>
Code examples that evolve
</examples>
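As a rough check on a file laid out like the template above, the static and dynamic parts can be separated at the dynamic marker. This is only a sketch: `split_static_dynamic` and `rough_token_count` are hypothetical helpers, and the four-characters-per-token estimate is an assumption, not a real tokenizer.

```python
DYNAMIC_MARKER = "<!-- DYNAMIC"


def split_static_dynamic(text, marker=DYNAMIC_MARKER):
    """Split a document into (static, dynamic) at the first dynamic marker."""
    index = text.find(marker)
    if index == -1:
        # No marker found: treat the whole document as static.
        return text, ""
    return text[:index], text[index:]


def rough_token_count(text):
    """Very rough estimate: about four characters per token (assumption)."""
    return len(text) // 4
```

Running `rough_token_count` on the static part gives a quick sense of whether it is long enough to be worth caching; a real tokenizer would be needed for an exact figure.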
Token Efficiency
Progressive Disclosure
Three-level loading:
- Metadata only (~50 tokens)
- Core concepts when triggered (~200 tokens)
- Full details when accessed (1000+ tokens)
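One way to picture the three levels is a small loader that returns only as much text as the requested level needs. The `SkillDoc` class and its field names are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass

LEVELS = ("metadata", "core", "full")


@dataclass
class SkillDoc:
    metadata: str  # always loaded
    core: str      # loaded when the skill is triggered
    full: str      # loaded only when details are accessed

    def load(self, level="metadata"):
        """Return the sections up to and including the requested level."""
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level!r}")
        depth = LEVELS.index(level) + 1
        parts = [self.metadata, self.core, self.full][:depth]
        return "\n\n".join(parts)
```

Defaulting to the metadata level keeps the cheapest path the easiest one to take.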
Sparse Attention
<required>
Efficient file reading:
- Grep to find the location
- Read with offset/limit for large files
- Read only the necessary context (±20 lines)
</required>
<forbidden>
- Loading full files when targeted reads suffice
- Reading documentation when metadata answers the question
- Repeating the user's question in responses
</forbidden>
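The grep-then-read pattern can be approximated with the standard library alone. This is a sketch under assumptions: `grep_first` and `read_window` are hypothetical stand-ins for an agent's Grep and Read tools, and the default radius of 20 lines mirrors the ±20 guideline above.

```python
import re
from itertools import islice


def grep_first(path, pattern):
    """Return the 1-based line number of the first match, or None."""
    regex = re.compile(pattern)
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if regex.search(line):
                return number
    return None


def read_window(path, center, radius=20):
    """Return (line_number, text) pairs for center ± radius lines."""
    start = max(1, center - radius)
    stop = center + radius  # inclusive upper bound
    with open(path, encoding="utf-8") as handle:
        # islice skips to the window and stops reading once it is consumed.
        window = islice(handle, start - 1, stop)
        return [(start + i, line.rstrip("\n")) for i, line in enumerate(window)]
```

Both helpers stream the file line by line, so a large file is never held in memory as a whole.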
Structured Output
Platform mechanisms:
- OpenAI: JSON Schema with strict: true (100% compliance)
- Anthropic: Tool use with flexible schemas
- Gemini: responseSchema with retry
Schema design:
- Match existing project patterns
- Include descriptions for complex fields
- Define required vs optional fields
- Keep nesting ≤3 levels
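The schema design points above can be illustrated with a small, provider-neutral JSON Schema and a depth check. `REVIEW_SCHEMA` and `nesting_depth` are hypothetical names for this example. Counting only nested objects as levels (arrays pass through to their items) is an assumption about how "nesting" is measured.

```python
REVIEW_SCHEMA = {
    "type": "object",
    "properties": {
        "summary": {
            "type": "string",
            "description": "One-paragraph summary of the change.",
        },
        "findings": {
            "type": "array",
            "description": "Individual issues, most severe first.",
            "items": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "file:line reference"},
                    "severity": {"type": "string", "enum": ["low", "medium", "high"]},
                    "note": {"type": "string"},  # optional: not listed in "required"
                },
                "required": ["location", "severity"],
            },
        },
    },
    "required": ["summary", "findings"],
}


def nesting_depth(schema):
    """Count levels of nested objects in a JSON Schema fragment."""
    kind = schema.get("type")
    if kind == "object":
        children = schema.get("properties", {}).values()
        return 1 + max((nesting_depth(child) for child in children), default=0)
    if kind == "array":
        # Arrays do not add a level themselves; inspect their items.
        return nesting_depth(schema.get("items", {}))
    return 0
```

A check such as `nesting_depth(REVIEW_SCHEMA) <= 3` could run in a test suite to catch schemas that grow too deep. Provider-specific strict modes may impose extra constraints that this generic schema does not model.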
- @smith-ctx/SKILL.md - Progressive disclosure, reference-based communication
- @smith-xml/SKILL.md - Approved XML tags
ACTION (Recency Zone)
<required>
For caching:
- Place static content before dynamic
- Maintain consistent tool order
- Target >80% cache hit rate
For efficiency:
- Use Grep before Read
- Read incrementally (narrow → expand)
- Use file:line references
</required>
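To track the hit-rate target, cached and total prompt tokens can be summed across calls. This is a sketch with assumed inputs: the `prompt_tokens` and `cached_tokens` keys are hypothetical, and real providers report these counts under their own field names.

```python
def cache_hit_rate(calls):
    """Fraction of prompt tokens served from cache across all calls."""
    total = sum(call["prompt_tokens"] for call in calls)
    cached = sum(call["cached_tokens"] for call in calls)
    return cached / total if total else 0.0


def meets_target(calls, target=0.80):
    """True when the aggregate hit rate exceeds the target."""
    return cache_hit_rate(calls) > target
```

Measuring over tokens rather than over calls weights long prompts more heavily, and those are where caching matters most.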