Documentation Maintenance
Systematically audit, organize, and remediate project documentation by comparing the codebase against existing docs to find staleness, gaps, and misorganization.
When to Trigger
- After merging a feature branch or completing a refactor
- After dependency upgrades or API changes
- When onboarding surfaces confusion about project docs
- Periodic maintenance (monthly or per-release)
- When scripts/doc_audit.py is run manually and reports findings
Workflow Overview
Phase 1: Audit → Run deterministic scan + search agents (haiku and sonnet)
Phase 2: Triage → Classify findings by severity and action type
Phase 3: Remediate → Dispatch specialized agents to fix/create docs
Phase 4: Quality → docs-architect reviews all changes
Phase 1: Audit
Step 1a — Run the deterministic scan
Execute the bundled audit script to get a baseline report:
python3 skills/doc-maintenance/scripts/doc_audit.py
The script produces a structured report covering:
- Broken internal links (markdown [text](path) pointing to missing files)
- Orphan docs (files not linked from any other doc or README)
- Missing required structure (expected folders/files absent from docs/ or manual/)
- Stale-relative-to-code (docs whose referenced subtree has churned significantly since the doc was last touched; measured via git log of code commits between the doc's last edit and now, not absolute doc age, which produces noise on stable architecture docs)
- Empty or stub files (< 3 lines of content)
Pass --json for machine-readable output. Pass --root PATH to override project root detection.
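As a rough illustration of two of the checks listed above, the sketch below resolves relative markdown links against a known file set and counts code commits newer than a doc's last edit. It is not the bundled script's implementation; the function names, the regex, and the timestamp inputs are all assumptions made for this example.

```python
import posixpath
import re

# Matches inline markdown links of the form [text](target).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_links(doc_path, text, existing_files):
    """Return link targets in `text` that do not resolve to a known file.

    `existing_files` is a set of repo-relative POSIX paths (hypothetical input).
    """
    base = posixpath.dirname(doc_path)
    missing = []
    for target in LINK_RE.findall(text):
        # External URLs and in-page anchors are out of scope for this check.
        if "://" in target or target.startswith(("#", "mailto:")):
            continue
        path = target.split("#", 1)[0]  # drop any heading fragment
        resolved = posixpath.normpath(posixpath.join(base, path))
        if resolved not in existing_files:
            missing.append(target)
    return missing


def code_churn_since(doc_last_edit, code_commit_times):
    """Count code commits made after the doc was last touched.

    Timestamps are plain numbers here; a real scan would read them from git log.
    """
    return sum(1 for t in code_commit_times if t > doc_last_edit)
```

Measuring churn relative to the doc's last edit, as opposed to the doc's absolute age, is what keeps a stable architecture doc from being flagged simply because it is old.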
Step 1b — Dispatch search agents
After the deterministic scan, launch subagents to perform deeper analysis.
Model selection is task-calibrated: haiku for pattern enumeration, sonnet for
multi-file correlation and judgment. See references/agent-dispatch.md for
full prompt templates.
Agent 1 — Code-to-doc coverage scan (subagent_type: "Explore", model: "haiku"):
Search the codebase for public APIs, CLI commands, config schemas, and exported modules.
Cross-reference against existing docs (presence-check via grep). Report anything undocumented.
Haiku is sufficient because the inner loop is pattern enumeration plus a presence grep.
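The presence check in Agent 1 can be pictured as a whole-word search over the docs for each enumerated symbol. The following is a minimal sketch under that assumption; `undocumented` and its inputs are hypothetical names, and the agent itself may use plain grep.

```python
import re


def undocumented(symbols, docs_text):
    """Return the symbols that never appear as a whole token in the docs."""
    missing = []
    for name in symbols:
        # Reject matches embedded in a longer identifier or flag name.
        pattern = r"(?<![\w-])" + re.escape(name) + r"(?![\w-])"
        if not re.search(pattern, docs_text):
            missing.append(name)
    return missing
```

The word-boundary guards matter: without them, a doc mentioning `parse_config_file` would count as documenting `parse_config`.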
Agent 2 — Doc-to-code freshness scan (subagent_type: "general-purpose",
model: "sonnet", per-docfile dispatch):
For each markdown file in scope, dispatch a sonnet agent with the file content. The
agent (a) extracts every concrete code reference (function names, CLI flags, file paths,
config keys, API endpoints), (b) verifies each against current code via grep / codanna /
Read, (c) reports mismatches as RENAMED / REMOVED / CHANGED. Sonnet + per-docfile is
needed because freshness verification often requires multi-file traces (handler →
middleware → config) and distinguishing renamed from removed requires reading the new
location, not just confirming the old one is gone.
Agent 3 — Structure compliance scan (subagent_type: "Explore", model: "haiku"):
Compare current docs/ and manual/ layout against the prescribed folder structure
in references/folder-structure.md. Report missing folders, misplaced files, naming violations.
Haiku is correct here — pure pattern matching against a spec.
Agent 4a — ASCII diagram detector (subagent_type: "Explore", model: "haiku"):
Scan markdown for box-drawing characters (─ │ ┌ ┐ └ ┘ ├ ┤ ┬ ┴ ┼), arrow notation
(-->, <--, ==>), or indented tree structures beyond a few nodes. Report each as
a candidate for Mermaid conversion with the suggested diagram type. Haiku is correct
because this is mechanical pattern detection — find the characters, report the location.
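The mechanical part of Agent 4a's detection could look roughly like the sketch below, which flags lines containing the listed box-drawing characters or arrow notation. The function name and the reason labels are assumptions, and indented tree detection is left out.

```python
import re

# Characters and arrow forms taken from the Agent 4a description.
BOX_CHARS = set("─│┌┐└┘├┤┬┴┼")
ARROW_RE = re.compile(r"-->|<--|==>")


def find_ascii_diagrams(markdown):
    """Return (line_number, reason) pairs for lines that look like ASCII diagrams."""
    hits = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        if BOX_CHARS & set(line):
            hits.append((lineno, "box-drawing"))
        elif ARROW_RE.search(line):
            hits.append((lineno, "arrow"))
    return hits
```

A practical detector would also need to skip existing Mermaid blocks and HTML comment closers, both of which contain `-->`.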
Agent 4b — Missing-diagram judgment scan (subagent_type: "general-purpose",
model: "sonnet", per-docfile dispatch):
For each markdown file in scope, dispatch a sonnet agent. The agent reads the doc and
flags sections where adding a diagram would meaningfully improve comprehension —
multi-step flows where ordering matters, architecture relationships with named
components, state transitions, request/response sequences. Crucially, sonnet flags
only when a diagram adds value over the prose (not every step list needs a diagram).
Haiku tends to either over-flag (every list looks diagrammable) or under-flag (misses
implicit flow descriptions); sonnet's judgment is what makes this signal trustworthy.
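Launch agents 1, 2, 3, 4a, and 4b in parallel. Agents 2 and 4b dispatch internally
per-docfile, so the actual call count is 3 + (2 × N_docfiles) rather than 5: one call each for agents 1, 3, and 4a, plus one sonnet call per docfile for each of agents 2 and 4b.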
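To make the fan-out concrete, the sketch below enumerates every Phase 1 call: one each for agents 1, 3, and 4a, and one per docfile for agents 2 and 4b. The tuple layout and the agent labels are illustrative assumptions, not a format the Task tool requires.

```python
def build_dispatch_plan(docfiles):
    """List every Phase 1 agent call as (agent, model, target) tuples."""
    # Single-shot haiku agents scan the whole project once.
    plan = [
        ("1-coverage", "haiku", None),
        ("3-structure", "haiku", None),
        ("4a-ascii-diagrams", "haiku", None),
    ]
    # Sonnet agents are dispatched once per markdown file in scope.
    for path in docfiles:
        plan.append(("2-freshness", "sonnet", path))
        plan.append(("4b-missing-diagrams", "sonnet", path))
    return plan
```

Because the sonnet calls scale with the number of docfiles, narrowing the set of files in scope is the main lever on cost.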
Step 1c — Merge results
Combine the script output with agent findings into a single audit report. Deduplicate overlapping findings. The report becomes the input for Phase 2.
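One possible shape for the merge step is sketched below: findings from the script and from each agent are concatenated, and later duplicates are dropped by a normalized key. The `path`, `kind`, and `detail` fields are hypothetical; the real report schema is not specified here.

```python
def merge_findings(*sources):
    """Merge finding lists, keeping the first occurrence of each finding."""
    seen = set()
    merged = []
    for findings in sources:
        for finding in findings:
            # Normalize the detail text so trivial differences do not defeat dedup.
            key = (finding["path"], finding["kind"], finding["detail"].strip().lower())
            if key not in seen:
                seen.add(key)
                merged.append(finding)
    return merged
```

Passing the script output first means the deterministic result wins when an agent reports the same issue.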
Phase 2: Triage
Classify each finding into one of these action categories:
| Category | Description | Example |
|----------|-------------|---------|
| stale | Doc exists but references outdated code/behavior | CLI flag renamed but docs show old name |
| missing | No doc exists for a documentation-worthy item | Public API endpoint with no reference doc |
| orphan | Doc exists but is unreachable / unlinked | Guide file not in any index or nav |
| misplaced | Doc exists but is in the wrong folder | Tutorial sitting in docs/architecture/ |
| irrelevant | Doc covers removed functionality | Guide for a deleted feature |
| structural | Folder structure deviates from prescribed layout | Missing docs/security/ folder |
| diagram-convert | ASCII/text diagram should be Mermaid | Complex box-drawing flowchart in architecture doc |
| diagram-missing | Section would benefit from a diagram | Multi-step process described only in prose |
Assign severity:
- P0 — User-facing doc is factually wrong (manual/)
- P1 — Developer doc references nonexistent code
- P2 — Missing doc for public API or feature
- P3 — Structural / organizational issues
- P4 — Minor staleness, cosmetic
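The severity levels can be approximated with a small rule function, sketched below. The mapping from category and path to severity is an assumption made for illustration (for example, treating any stale finding under manual/ as P0, and defaulting unlisted categories to P4); real triage may need human or agent judgment.

```python
SEVERITY_ORDER = ["P0", "P1", "P2", "P3", "P4"]


def assign_severity(finding):
    """Map a finding to a severity level using simplified, assumed rules."""
    category = finding["category"]
    path = finding.get("path", "")
    if category == "stale" and path.startswith("manual/"):
        return "P0"  # user-facing doc is wrong
    if category == "stale" and finding.get("references_missing_code"):
        return "P1"  # developer doc points at nonexistent code
    if category == "missing":
        return "P2"
    if category in {"structural", "misplaced", "orphan"}:
        return "P3"
    return "P4"  # minor staleness, cosmetic, and anything unlisted


def triage(findings):
    """Sort findings so the most severe come first."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER.index(assign_severity(f)))
```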
Phase 3: Remediate
Route each finding to the appropriate specialist agent. Use the Task tool with
the subagent types listed below. See references/agent-dispatch.md for detailed
prompt templates.
| Doc type | Subagent type | Target location |
|----------|---------------|-----------------|
| API reference docs | reference-builder | docs/reference/ or docs/api/ |
| Architecture docs | technical-writer | docs/architecture/ |
| Developer guides (style, local dev, workflows) | technical-writer | docs/development/ |
| Testing docs | technical-writer | docs/testing/ |
| Security docs | technical-writer | docs/security/ |
| User-facing tutorials | learning-guide | manual/tutorials/ |
| User-facing how-to guides | learning-guide | manual/guides/ |
| User-facing getting started | learning-guide | manual/getting-started/ |
| Plans and proposals | technical-writer | docs/plans/ |
| ASCII diagram conversion | mermaid-expert | Inline in existing doc |
| New diagrams for prose sections | mermaid-expert | Inline in existing doc |
Parallel dispatch: Group independent remediation tasks and dispatch them simultaneously. Only serialize when one doc depends on another (e.g., an API reference needed before a tutorial that links to it). Dispatch up to 4 remediation agents in parallel per batch.
For updates to existing docs: Provide the agent with the current file contents and the specific finding to fix. Instruct it to make minimal, targeted edits.
For new docs: Provide the agent with the relevant source code, the target file path, and the folder-structure spec so it follows naming conventions.
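The routing table and the batch limit of 4 can be combined as in the sketch below. The subagent names come from the table above; the dictionary keys, the `doc_type` and `path` fields, and the function name are hypothetical. The sketch assumes all tasks are independent, so dependency ordering would have to be handled before batching.

```python
from itertools import islice

# Subagent names come from the routing table; the keys are invented labels.
ROUTES = {
    "api-reference": "reference-builder",
    "architecture": "technical-writer",
    "developer-guide": "technical-writer",
    "testing": "technical-writer",
    "security": "technical-writer",
    "tutorial": "learning-guide",
    "how-to": "learning-guide",
    "getting-started": "learning-guide",
    "plan": "technical-writer",
    "diagram-convert": "mermaid-expert",
    "diagram-missing": "mermaid-expert",
}


def plan_batches(findings, size=4):
    """Route each finding to a subagent and group the tasks into batches."""
    tasks = iter((ROUTES[f["doc_type"]], f["path"]) for f in findings)
    batches = []
    while True:
        batch = list(islice(tasks, size))  # take up to `size` tasks
        if not batch:
            return batches
        batches.append(batch)
```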
Phase 4: Quality Gate
After all remediation agents complete, dispatch a single docs-architect agent to review
the full set of changes. The quality gate checks:
- Accuracy — Do docs match current code?
- Completeness — Are all public interfaces covered?
- Organization — Does folder structure match the prescribed layout?
- Cross-references — Are all internal links valid?
- Consistency — Tone, formatting, heading levels
- No orphans — Every new doc is linked from an index or parent doc
If the quality gate fails, loop back to Phase 3 for the specific issues flagged. Maximum 2 remediation loops before escalating to the user.
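The bounded retry described above can be sketched as a small loop. Here `review` and `remediate` are hypothetical callables standing in for the docs-architect review and the Phase 3 dispatch; only the limit of 2 loops comes from the text.

```python
MAX_REMEDIATION_LOOPS = 2


def run_quality_gate(review, remediate, max_loops=MAX_REMEDIATION_LOOPS):
    """Review, then remediate and re-review up to `max_loops` times.

    Returns ("pass", []) when the review is clean, or
    ("escalate", remaining_issues) when the loop budget is exhausted.
    """
    issues = review()
    loops = 0
    while issues and loops < max_loops:
        remediate(issues)  # loop back to Phase 3 for the flagged issues only
        loops += 1
        issues = review()
    if issues:
        return ("escalate", issues)
    return ("pass", [])
```

Capping the loop keeps a persistent disagreement between the remediation agents and the reviewer from cycling indefinitely; the remaining issues go to the user instead.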
Folder Structure
The prescribed folder layout is defined in references/folder-structure.md. Summary:
docs/ — Internal / developer documentation
docs/
├── architecture/ — System design, ADRs, component diagrams
├── development/ — Developer guides: style, local setup, issue tracking
├── plans/ — Proposals, RFCs, roadmaps
├── reviews/ — Code review records, audit reports
├── testing/ — Test strategy, coverage reports, test plans
├── reports/ — Generated reports, metrics, analysis
├── security/ — Security policies, threat models, audit findings
├── api/ — Internal API docs (OpenAPI specs, gRPC protos)
├── reference/ — CLI reference, config reference, manpages
├── ideas/ — Exploratory notes, spikes, brainstorms
└── archive/ — Deprecated docs preserved for history
manual/ — User-facing documentation (project root)
manual/
├── getting-started/ — Installation, quickstart, first steps
├── guides/ — How-to guides for common tasks
├── tutorials/ — Step-by-step learning paths
├── reference/ — User-facing command/config reference
└── troubleshooting/ — FAQ, common errors, known issues
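A structure compliance check against the two trees above can be as simple as a set difference, as in the sketch below. Treating every listed folder as required is an assumption; references/folder-structure.md remains the authority on which folders are mandatory. The function only reports gaps and creates nothing.

```python
# Folder names are taken from the summary trees above.
REQUIRED_DIRS = [
    "docs/architecture", "docs/development", "docs/plans", "docs/reviews",
    "docs/testing", "docs/reports", "docs/security", "docs/api",
    "docs/reference", "docs/ideas", "docs/archive",
    "manual/getting-started", "manual/guides", "manual/tutorials",
    "manual/reference", "manual/troubleshooting",
]


def missing_dirs(existing_dirs, required=REQUIRED_DIRS):
    """Return required folders that are absent, in the prescribed order."""
    existing = {d.rstrip("/") for d in existing_dirs}
    return [d for d in required if d not in existing]
```

Each missing folder would surface as a structural finding in Phase 2, to be filled only when there is real content to put in it.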
README.md — Project root
The main README is audited for accuracy but not reorganized. Findings about the README are reported as stale/missing items for manual remediation.
Anti-Patterns
- Do not delete docs without confirming the feature they describe is truly removed
- Do not reorganize docs without updating all internal cross-references
- Do not create stub files just to fill the folder structure — only create docs with real content
- Do not duplicate content between docs/ and manual/ (link instead)
- Do not move user-facing docs into docs/ or developer docs into manual/