Skill Retro
Analyze how skills performed in the current session and apply improvements via skill-creator.
Process
Step 1: Preprocess Session
Run the preprocessing script to extract a clean transcript from the current session's JSONL log:
node ${CLAUDE_SKILL_DIR}/scripts/preprocess.mjs --cwd <current-working-directory>
Capture the JSON output. Key fields:
transcript_file— path to the temp file containing the full transcript (at~/.claude/tmp/skill-retro-<session-id>.json)stats.skill_invocations— count of skills used
The script writes the full transcript to a temp file to avoid flooding the main thread's context window. Only the summary (stats + file path) appears in stdout.
If the script fails (e.g., no session files found), report the error and stop.
If skill_invocations is 0, tell the user: "No skills were invoked in this session — nothing to analyze." and stop.
Step 2: Analyze (Sub-Agent)
Spawn an analysis sub-agent (using the Agent tool, model: sonnet):
- Load the prompt from
${CLAUDE_SKILL_DIR}/references/analysis-prompt.md - Tell the agent to read the transcript file at the path from
transcript_filein Step 1's output - The agent should read the SKILL.md file for each invoked skill to compare intended vs. actual behavior. Skill paths are discoverable from the transcript's system-reminder blocks (which list installed skill base directories) or from
[SKILL INVOCATION]markers combined with installed plugin cache paths. - The agent returns a JSON object with
findings,well_executed, andseverity_summary
Parse the agent's response into structured findings.
Once the analysis agent has finished, delete the temp file created in Step 1:
rm <transcript_file>
If the agent returns 0 findings, tell the user: "All skills performed well this session. No improvements identified." Show the well_executed list and stop.
Step 3: Interpret and Recommend
Don't just present raw findings — interpret them. For each finding, add your own reasoning:
- What the fix looks like — Is it a one-line description tweak? A new section in SKILL.md? A script rewrite? Be specific about what changes would actually be made.
- Effort estimate — Quick fix (description/wording change), moderate (new section or error handling), or high-lift (architectural change, script rewrite, new scripts).
- Recommendation — Whether to action it now, defer it, or skip it entirely. Explain why.
Group findings by skill and present with your interpretation:
## Skill Performance Report
### scheduler:manage (3 findings)
**1. [high] gap_coverage — pause command doesn't verify unload succeeded**
The `_launchctl_unload` helper silently swallows errors, so the registry can say "paused" while the job is still running.
- **Fix:** Add return-code checking to `_launchctl_unload` in the manage script, and add a post-unload verification step in SKILL.md's Pause operation.
- **Effort:** Moderate — script change + SKILL.md update
- **Recommend: Action** — This caused real user confusion in this session.
**2. [medium] gap_coverage — No error recovery entry for "task still runs after pausing"**
The assistant had to improvise the entire investigation.
- **Fix:** Add a new row to the Error Recovery table in SKILL.md.
- **Effort:** Quick — one table row addition
- **Recommend: Action** — Low effort, high value.
**3. [low] execution_quality — list command shows registry status without cross-checking launchd**
Would require the list command to shell out to `launchctl list` and reconcile.
- **Fix:** Add launchd state reconciliation to the list operation in the manage script.
- **Effort:** High-lift — requires new script logic and testing across platforms.
- **Recommend: Defer** — Nice-to-have but significant work. Not worth addressing now.
### Well Executed
- list presented clean, accurate registry data
- Assistant correctly went outside the skill to diagnose the real issue
---
I recommend actioning findings 1 and 2. Finding 3 is high-lift and better deferred.
Proceed with these? (y/adjust/none)
Key principles:
- Lead with your recommendation — don't make the user figure out what's worth doing
- Be honest about high-lift items — don't propose changes that would require major rewrites unless they're clearly worth it
- Group related findings that would be addressed together
- The user can adjust your recommendation (add/remove items) or accept it
Wait for user confirmation. If "none", show summary and stop.
Step 4: Resolve Source Locations
For each skill with selected findings, determine where to edit:
Resolution order:
-
Project-level skill — path contains
.claude/skills/relative to a project → Candidate: edit in place (user owns it) -
User-level skill — path is under
~/.claude/skills/→ Candidate: edit in place (user owns it) -
Installed plugin — path is under
~/.claude/plugins/cache/→ Try to trace to source: a. Extract plugin name from the path b. Check if cwd is a marketplace project containing that plugin's source (look for<plugin-name>/skills/<skill-name>/SKILL.md) c. If not found, ask user: "Where is the source code for the<plugin-name>plugin? Provide a path, or type 'installed' to edit the installed copy." → If "installed": warn that changes will be overwritten on plugin update, proceed only if user confirms → If user provides a path: verify it exists and contains the skill -
Current project contains source — cwd has
./<plugin-name>/skills/<skill-name>/SKILL.md→ Candidate: edit in place
For each resolved path, confirm with the user before proceeding:
I'll make changes to <skill-name> at:
→ /path/to/resolved/source/SKILL.md
Is this the right location? (y/n or provide correct path)
Do NOT proceed to implementation until every path is confirmed.
Step 5: Implement Improvements (Parallel Sub-Agents)
For each affected skill (with confirmed source path), spawn an implementation sub-agent:
-
One agent per skill — can run in parallel since they edit different files
-
Agent prompt:
"You are improving a Claude Code skill based on performance analysis findings.
Skill:
<skill-name>Source path:<confirmed-path-to-skill-directory>SKILL.md location:<confirmed-path>/SKILL.mdFindings to address: <list all selected findings for this skill with full observation, evidence, and proposed_improvement>
Instructions:
- Read the current SKILL.md at the source path
- Invoke the
skill-creator:skill-creatorskill - When skill-creator asks what you want to do, explain you are improving an existing skill based on session analysis
- Provide the findings as the basis for changes
- Follow skill-creator's process to apply the improvements
- After changes are made, report what was modified"
Step 6: Summary
After all implementation agents complete, present a summary:
## Skill Retro Complete
### Changes Applied
| Skill | Source Path | Changes |
|-------|------------|---------|
| scheduler:manage | /path/to/SKILL.md | Added return-code checking to _launchctl_unload, post-unload verification step |
| scheduler:manage | /path/to/SKILL.md | Added "task still runs after pausing" to Error Recovery table |
### Deferred
- scheduler:manage #3 (launchd state reconciliation) — high-lift, deferred
### Well Executed
- list operation presented clean, accurate registry data
Important Notes
- This skill is designed to run late in sessions when context may be full. All analysis and implementation happens in sub-agents to preserve main thread context.
- The preprocessing script writes the full transcript to
~/.claude/tmp/skill-retro-*.jsonand only outputs a summary to stdout. Clean up the temp file after the analysis sub-agent has finished reading it. - The preprocessing script has zero dependencies beyond Node.js.
- Never edit a skill without user confirmation of the source path.
- When invoking skill-creator in implementation agents, let it guide the process — don't bypass its workflow.