Session Analytics Skill

Understand where tokens go, identify waste, and optimize Claude Code sessions for cost and speed.

Goal

Give users the tools to measure, understand, and improve the efficiency of their Claude Code sessions.

Token Tracking

Reading /cost output

Run /cost in any session to see:

Input tokens:        145,230
Output tokens:        28,450
Cache read tokens:    89,100  (cheaper — 10% of input price)
Cache write tokens:   12,400  (25% more than input price)
Total estimated cost: $0.87

What each category means

| Category | What it represents | Cost relative to input |
|----------|-------------------|----------------------:|
| Input tokens | New content sent to the model each turn | 1.0x |
| Output tokens | Content the model generates | 5.0x (Opus/Sonnet) |
| Cache read | Content matched from prompt cache | 0.1x |
| Cache write | Content added to prompt cache | 1.25x |

Cache reads are the cheapest way to send input: they cost one tenth of the price of fresh input tokens.
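
The multipliers in the table can be folded into a single "input-equivalent" token count, which makes sessions comparable regardless of the model's base price. The following is a minimal sketch, not part of Claude Code: the function names are illustrative, and `base_price_per_mtok` (dollars per million input tokens) is an assumed parameter you would supply for your model.

```python
# Relative cost multipliers taken from the table above (input = 1.0x).
MULTIPLIERS = {
    "input": 1.0,
    "output": 5.0,
    "cache_read": 0.1,
    "cache_write": 1.25,
}


def input_equivalent_tokens(usage: dict[str, int]) -> float:
    """Weight each token category by its multiplier and sum the result."""
    return sum(MULTIPLIERS[name] * count for name, count in usage.items())


def estimated_cost(usage: dict[str, int], base_price_per_mtok: float) -> float:
    """Rough dollar estimate; base_price_per_mtok is an assumed input price per 1M tokens."""
    return input_equivalent_tokens(usage) * base_price_per_mtok / 1_000_000


# Token counts from the sample /cost output above.
usage = {
    "input": 145_230,
    "output": 28_450,
    "cache_read": 89_100,
    "cache_write": 12_400,
}
print(round(input_equivalent_tokens(usage)))
```

Treat the result as a rough comparison tool. The figure reported by /cost depends on the actual pricing of the model in use and may differ from this estimate.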

Token flow per turn

Each turn sends:

System prompt (~3-5k tokens, usually cached)
CLAUDE.md + rules (~2-20k tokens, usually cached after turn 1)
Conversation history (grows each turn)
Tool results from the previous turn
New user message

The conversation history is the main cost driver. It is resent on every turn and grows monotonically until you run /compact.
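
To see why history dominates, it can help to model a session in which every turn resends everything before it. This is a simplified sketch under stated assumptions: the prefix size and per-turn growth are constant, caching is ignored, and the function name and sample numbers are hypothetical.

```python
def cumulative_input_tokens(fixed_prefix: int, per_turn_growth: int, turns: int) -> int:
    """Total input tokens sent over a session when history is resent every turn.

    fixed_prefix: system prompt plus CLAUDE.md tokens (assumed constant)
    per_turn_growth: tokens each turn adds to the history (assumed constant)
    """
    total = 0
    history = 0
    for _ in range(turns):
        total += fixed_prefix + history  # the whole history is sent again
        history += per_turn_growth       # this turn's messages and tool results join the history
    return total


for turns in (10, 20, 40):
    print(turns, cumulative_input_tokens(8_000, 2_000, turns))
```

In this model, doubling the number of turns roughly triples or more the total input, because the history term grows with the square of the turn count. That is the reason trimming tool output and compacting at planned points pays off more in long sessions.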

Bottleneck Identification

Signs of token waste

| Pattern | Symptom | Fix |
|---------|---------|-----|
| Repeated file reads | Same file in tool calls 3+ times | Read once, reference from memory |
| Over-broad Bash output | ls -R or cat on large files | Use Glob/Grep with limits |
| Unnecessary subagent spawning | Subagent for trivial lookup | Direct tool call instead |
| Large tool output | Bash command returns 500+ lines | Pipe through head or tail |
| Context thrashing | /compact then immediately re-read same files | Better anchor planning |
| Wrong model tier | Opus for file search | Switch to Haiku for lookups |

Tool call cost ranking

Roughly from most to least expensive per call (typical):

Bash: unbounded output, can return huge results (10k+ tokens)
Read: proportional to file size (100-5000 tokens typical)
Agent: spawns a new context (10k-100k tokens, but isolated from the main conversation)
Grep (content mode): proportional to matches (100-2000 tokens)
Glob: file list only (50-500 tokens)
Edit: small diff (100-300 tokens)
Write: proportional to file size, but output-only

Measuring tool efficiency

Good efficiency indicators:

Cache read ratio > 60% (most content is being cached)
Average tool output < 500 tokens per call
Files read at most twice per session
Subagents used for research, not simple lookups
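
The first three indicators are mechanical enough to check from session totals. The sketch below assumes you have already collected those totals yourself; the function and argument names are hypothetical, and the cache read ratio is assumed to be measured against all input-side tokens (fresh input, cache reads, and cache writes).

```python
from collections import Counter


def efficiency_report(
    cache_read: int,
    fresh_input: int,
    cache_write: int,
    tool_outputs: list[int],
    files_read: list[str],
) -> dict[str, bool]:
    """Check a session against the indicator thresholds listed above."""
    # Assumption: the ratio's denominator is every token sent as input.
    total_input = fresh_input + cache_read + cache_write
    read_counts = Counter(files_read)
    return {
        "cache_read_ratio_ok": total_input > 0 and cache_read / total_input > 0.60,
        "avg_tool_output_ok": bool(tool_outputs)
        and sum(tool_outputs) / len(tool_outputs) < 500,
        "file_rereads_ok": all(count <= 2 for count in read_counts.values()),
    }


report = efficiency_report(
    cache_read=70_000,
    fresh_input=20_000,
    cache_write=10_000,
    tool_outputs=[120, 300, 900],
    files_read=["a.py", "a.py", "b.py", "a.py"],
)
print(report)
```

The fourth indicator, whether subagents were used for research or for simple lookups, is a judgment call and is left out of the check.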

Caching Behavior

How prompt caching works

Claude Code automatically caches the following between turns:

System prompt
CLAUDE.md and rules content
Conversation history up to a certain point

Cache hits occur when the same content prefix appears in consecutive turns. This means:

Turn 1: all input is fresh (cache write)
Turn 2+: system prompt and early conversation = cache read

Maximizing cache hits

Keep CLAUDE.md stable — changes invalidate the cache
Don't rearrange conversation — prefix must match exactly
Batch tool calls — multiple calls in one turn share the same cached prefix
Avoid /compact too early — compaction rewrites history, invalidating cache

Cache break points

These actions invalidate the cache:

Editing CLAUDE.md mid-session
/compact — rewrites conversation history
Tool output that changes message ordering
Switching models mid-session
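
The prefix rule behind these break points can be illustrated with a toy model. This is only a sketch: it compares whole blocks of content as strings, whereas real prompt caching operates on the request's token prefix, and the function name and sample blocks are hypothetical.

```python
def cached_prefix_length(previous: list[str], current: list[str]) -> int:
    """Count the leading blocks that are identical in two consecutive requests."""
    length = 0
    for old, new in zip(previous, current):
        if old != new:
            break  # everything after the first difference is a cache miss
        length += 1
    return length


turn_1 = ["system", "claude_md_v1", "user: fix bug"]
# Normal next turn: the earlier blocks are unchanged and new ones are appended.
turn_2 = turn_1 + ["assistant: reading file", "user: thanks"]
# CLAUDE.md edited mid-session: the second block no longer matches.
edited = ["system", "claude_md_v2", "user: fix bug", "assistant: reading file"]

print(cached_prefix_length(turn_1, turn_2))  # 3
print(cached_prefix_length(turn_1, edited))  # 1
```

In the second case the conversation blocks are unchanged, yet they sit after the edited block, so in this model they no longer count as part of the shared prefix.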

Cost Estimation

Before starting a task

Estimate cost using these heuristics:

| Task Type | Model | Typical Turns | Typical Cost |
|-----------|-------|-------------:|-------------:|
| Quick bug fix | Sonnet | 5-10 | $0.10-0.30 |
| Feature implementation | Sonnet | 15-30 | $0.50-2.00 |
| Large refactor | Sonnet | 30-60 | $2.00-5.00 |
| Architecture analysis | Opus | 10-20 | $3.00-8.00 |
| Code review (council) | Mixed | 20-40 | $3.00-10.00 |
| Research task | Haiku | 5-15 | $0.02-0.10 |

Tokens per file type

| File Type | Avg Tokens/Line | 100-Line File |
|-----------|----------------:|--------------:|
| TypeScript | ~10 | ~1,000 |
| Python | ~8 | ~800 |
| JSON | ~6 | ~600 |
| Markdown | ~5 | ~500 |
| YAML | ~5 | ~500 |
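
These averages can be turned into a quick pre-read estimate. The sketch below is illustrative only: the extension mapping and the fallback value for unlisted file types are assumptions, and the per-line figures are the approximate averages from the table, not tokenizer output.

```python
from pathlib import PurePath

# Approximate tokens per line, from the table above.
TOKENS_PER_LINE = {
    ".ts": 10,
    ".py": 8,
    ".json": 6,
    ".md": 5,
    ".yaml": 5,
    ".yml": 5,
}
DEFAULT_TOKENS_PER_LINE = 8  # assumed fallback for file types not in the table


def estimate_read_tokens(path: str, line_count: int) -> int:
    """Estimate how many tokens a Read of this file would add to the context."""
    suffix = PurePath(path).suffix.lower()
    return TOKENS_PER_LINE.get(suffix, DEFAULT_TOKENS_PER_LINE) * line_count


print(estimate_read_tokens("src/app.ts", 100))
print(estimate_read_tokens("config.yaml", 40))
```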

Cost reduction techniques

Ordered by impact:

Use Haiku for research — 18x cheaper than Opus
Use subagents — isolate expensive work from main context
Compact at 60-70% — before context becomes too expensive
Limit tool output — use head, tail, --limit on commands
Batch related work — maximize cache hits within a session
Avoid re-reading files — read once, reference from context
Use Grep over Bash grep — structured output, lower tokens
Set --max-turns for automation — cap headless sessions

Metrics to Track

For teams and repeat workflows:

| Metric | Formula | Target |
|--------|---------|--------|
| Cost per commit | total session cost / commits produced | < $1.00 |
| Context efficiency | useful output tokens / total input tokens | > 15% |
| Cache hit rate | cache read tokens / total input tokens | > 50% |
| Tokens per task | total tokens / tasks completed | decreasing over time |
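
The formulas in the table translate directly into code. The following is a sketch under the assumption that you record per-session totals yourself; the `SessionTotals` type and its field names are hypothetical, and deciding what counts as "useful" output is left to the caller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionTotals:
    cost_usd: float
    commits: int
    useful_output_tokens: int
    total_input_tokens: int
    cache_read_tokens: int
    total_tokens: int
    tasks_completed: int


def _ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def session_metrics(s: SessionTotals) -> dict[str, Optional[float]]:
    """Apply the four formulas from the table above."""
    return {
        "cost_per_commit": _ratio(s.cost_usd, s.commits),
        "context_efficiency": _ratio(s.useful_output_tokens, s.total_input_tokens),
        "cache_hit_rate": _ratio(s.cache_read_tokens, s.total_input_tokens),
        "tokens_per_task": _ratio(s.total_tokens, s.tasks_completed),
    }


totals = SessionTotals(
    cost_usd=1.50,
    commits=3,
    useful_output_tokens=30_000,
    total_input_tokens=200_000,
    cache_read_tokens=120_000,
    total_tokens=230_000,
    tasks_completed=2,
)
print(session_metrics(totals))
```

Returning `None` for a zero denominator keeps sessions with no commits or no completed tasks from raising an error when metrics are aggregated across a team.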

Session Planning

Before a complex task

Estimate scope: how many files, how complex
Choose model tier: Haiku (research), Sonnet (implementation), Opus (architecture)
Plan subagent delegation: what can be offloaded
Set budget ceiling: expected cost with 50% buffer
Plan compact points: at what context % to compact
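
The last two planning steps reduce to simple arithmetic. A minimal sketch, assuming you supply the expected cost and the context figures yourself; both function names and the default compaction threshold are illustrative.

```python
def budget_ceiling(expected_cost_usd: float, buffer: float = 0.5) -> float:
    """Expected cost plus a safety buffer (50% by default, as suggested above)."""
    return expected_cost_usd * (1 + buffer)


def should_compact(context_used: int, context_window: int, threshold: float = 0.7) -> bool:
    """True once context usage reaches the planned compaction point."""
    return context_used / context_window >= threshold


print(budget_ceiling(2.00))
print(should_compact(150_000, 200_000))
```

Here a task expected to cost $2.00 gets a $3.00 ceiling, and a session using 150,000 tokens of an assumed 200,000-token window has passed a 70% compaction point.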

During the session

Monitor with /cost periodically
Switch models when task complexity changes
Delegate research to Haiku subagents
Compact before context hits 80%
Use Grep and Glob before Read (cheaper discovery)
