Context Engineering
Unified skill for all context engineering patterns in AI agent systems. This skill consolidates context management, compression, degradation detection, and KV-cache optimization into a single entry point with progressive disclosure.
Quick Decision Matrix
| Problem | Solution | Reference | |:--------|:---------|:----------| | Context window filling up | Compression | compression.md | | Agent ignoring mid-context info | Degradation Detection | degradation.md | | High API costs | KV-Cache Optimization | kv-cache.md | | Session state persistence | Session Management | session-management.md |
Core Concepts
Context Window Thresholds
| Utilization | Action | Technique | |:------------|:-------|:----------| | <60% | Monitor | No action needed | | 60-80% | Light compression | Observation masking | | 80-95% | Aggressive compression | Summarization + compaction | | >95% | Emergency | Force session handoff |
The Four-Bucket Framework
- Write: Save non-critical info outside context (scratchpads, files)
- Select: Pull only relevant context (high-precision retrieval)
- Compress: Reduce while preserving information
- Isolate: Separate contexts across sub-agents
Compression Techniques Summary
| Technique | Token Overhead | Reduction | Best For | |:----------|:--------------|:----------|:---------| | Observation Masking | 0% | 90-98% | Tool outputs >200 tokens | | Summarization | 5-7% | 60-90% | Mixed content | | Compaction | 0% | 50-80% | Older messages |
Quick Pattern - Observation Masking:
Before: 500 lines of tool output (500 tokens)
After: "See /results/search_20260101.txt" (12 tokens)
Degradation Patterns Summary
| Pattern | Symptom | Mitigation | |:--------|:--------|:-----------| | Lost-in-Middle | Info at 40-60% position ignored | Place critical info at start/end | | Context Poisoning | Errors compound through references | Require source citations | | Context Distraction | Model ignores training knowledge | Quality over quantity | | Context Confusion | Incorrect associations | Rigorous context selection | | Context Clash | Contradictory information | Establish information hierarchy |
KV-Cache Optimization Summary
The Four Principles:
- Stable Prefix: Never change system prompts across requests
- Append-Only: Never modify previous messages
- Deterministic Serialization: Same data = same tokens (sort JSON keys)
- Explicit Breakpoints: Mark cache boundaries
Session Management Summary
Directory Structure:
.cattoolkit/
├── context/
│ ├── scratchpad.md # Current thinking/decisions
│ ├── todos.md # Persistent task tracking
│ ├── context.log # Session history
│ └── checkpoints/ # State snapshots
Scratchpad Hygiene Rule: Only update scratchpad for:
- Critical decisions made
- Errors encountered
- Phase changes
- Progress milestones
Attention Manipulation via TodoWrite (Proactive Tracking)
The recitation technique from Manus/Claude Code pushes objectives into recent attention span to prevent "lost-in-the-middle" issues:
The Pattern:
- Create todo.md at task start
- Update continuously - Check off completed items, add new ones
- Recite objectives - Rewrite todo to push global plan into model's recent attention
Why It Works:
- Constant todo rewriting recites objectives into context end
- Avoids "lost-in-the-middle" issues without architectural changes
Implementation:
# Before task
- [ ] Research codebase structure
- [ ] Identify patterns
- [ ] Plan implementation
# After research
- [x] Research codebase structure
- [ ] Identify patterns ← Still visible in recent attention
- [ ] Plan implementation
Best Practice: Update todos after every major tool call to maintain objective visibility.
System Reminders Integration
System reminders combat context degradation through recurring objective injection:
Locations:
- User messages - System reminders in prompt
- Tool results - Runtime injections
- Code execution - Added via scripts
Usage Pattern:
# Add reminder at critical points
echo "Reminder: Focus on authentication edge cases" >> .claude/reminders.txt
Effective Reminders:
- Objective recitation - Reiterate main goal
- Constraint reinforcement - Re-emphasize critical requirements
- Context anchoring - Reference key context elements
Plan Mode Best Practices
Plan mode uses recurring prompts to remind the agent:
Implementation:
- Creates markdown files (PLAN.md) persisted during compaction
- Stored in
.cattoolkit/context/ - Accessible via
/plancommand - Multiple plan prompts and tool schemas for lifecycle
When to Use:
- Complex tasks requiring 10+ tool calls
- Multi-phase implementations
- When agent appears confused or drifting
- Long-running workflows
Best Practices:
- Create plan at task start
- Update as understanding evolves
- Reference plan in reminders
- Use as context anchor during compaction
Integration Points
| Skill | Integration | |:------|:------------| | memory-systems | Long-term memory complements context | | agent-orchestration | Each agent manages own context | | planning-with-files | Plans stored outside context |
Usage
When invoked, this skill will:
- Assess current context state
- Identify appropriate technique
- Apply optimization
- Generate metrics report
For detailed implementation, see references/ subdirectory.