Code Visualizer Skill
Purpose
Automatically generate and maintain visual code flow diagrams across multiple programming languages. The skill auto-detects which languages are present in a target path, analyzes each one with a dedicated analyzer, and emits one mermaid diagram per language plus an optional combined high-level view. It also detects when committed diagrams are stale relative to the source they describe.
What's New in 2.0.0
- Multi-language support: Python, TypeScript/JavaScript, Rust, and Go.
- Language dispatcher: Detects languages by file extension and routes to per-language analyzers.
- Language-blind renderer: A single mermaid renderer consumes a normalized graph; the renderer never inspects language semantics.
- One diagram per language plus an optional --combined view that places each language in its own mermaid subgraph.
- Generalized staleness: Walks all source files matching detected languages' extensions and compares max-mtime against the diagram mtime.
- Brick-style architecture: Each language analyzer is a self-contained module that exposes a single normalize() function. No shared inheritance.
Supported Languages
| Language | Extensions | Analyzer | Parser | Notes |
| --------------------- | -------------------------------------------- | ----------------- | ------ | ---------------------------------------------------------------- |
| Python | .py | python_analyzer | ast | Extracts import and from … import …. |
| TypeScript/JavaScript | .ts, .tsx, .js, .jsx, .mjs, .cjs | ts_analyzer | regex | Extracts import … from, require(...), dynamic import(...). |
| Rust | .rs | rust_analyzer | regex | Extracts use crate::…, use super::…, mod …. |
| Go | .go | go_analyzer | regex | Extracts single and grouped import declarations. |
Languages outside this table are skipped silently. See Extending below to add new ones.
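To make the regex-based rows of the table concrete, here is a minimal sketch of how single and grouped Go import declarations could be extracted. The pattern names and the go_imports helper are hypothetical, not the patterns used in go_analyzer, and the sketch deliberately ignores comments and cgo blocks.

```python
import re

# Hypothetical patterns for illustration; the real go_analyzer may differ.
# Matches `import "pkg"` and `import alias "pkg"` on a single line.
SINGLE_IMPORT = re.compile(r'^import\s+(?:[\w.]+\s+)?"([^"\n]+)"', re.MULTILINE)
# Captures the body of a grouped `import ( ... )` block.
GROUPED_IMPORT = re.compile(r'^import\s*\(([^)]*)\)', re.MULTILINE)
# Pulls each quoted import path out of a grouped block body.
QUOTED = re.compile(r'"([^"\n]+)"')


def go_imports(source: str) -> list[str]:
    """Return import paths: single declarations first, then grouped ones."""
    found = SINGLE_IMPORT.findall(source)
    for block in GROUPED_IMPORT.findall(source):
        found.extend(QUOTED.findall(block))
    return found
```

Because the character classes next to each quantifier do not overlap, these patterns avoid the nested-quantifier shapes that lead to catastrophic backtracking.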
Architecture
amplifier-bundle/skills/code-visualizer/
├── SKILL.md
├── README.md
└── scripts/
    ├── __init__.py
    ├── graph.py              # Normalized data contract (Node, Edge, Graph)
    ├── python_analyzer.py    # normalize(paths) -> Graph
    ├── ts_analyzer.py        # normalize(paths) -> Graph
    ├── rust_analyzer.py      # normalize(paths) -> Graph
    ├── go_analyzer.py        # normalize(paths) -> Graph
    ├── dispatcher.py         # detect languages, route, return dict[lang, Graph]
    ├── mermaid_renderer.py   # render(graph) / render_combined(graphs)
    ├── staleness.py          # is_stale(target, diagram, languages)
    └── visualizer.py         # CLI entry point
Data Contract (graph.py)
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    id: str         # mermaid-safe identifier
    label: str      # human-readable label (e.g. "src/auth/oauth.py")
    language: str   # "python" | "typescript" | "rust" | "go"
    file_path: str  # absolute path on disk

@dataclass(frozen=True)
class Edge:
    src: str   # Node.id of source
    dst: str   # Node.id of destination
    kind: str  # "import" | "require" | "use" | "mod" | "dynamic_import"

@dataclass(frozen=True)
class Graph:
    language: str
    nodes: tuple[Node, ...]
    edges: tuple[Edge, ...]
Analyzers may import these dataclasses but must not inherit from any shared class. The data contract is the only coupling.
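As a small illustration of the contract, the sketch below redeclares the three dataclasses so it runs on its own, then builds the graph for one file importing another. The build_example_graph helper and the /repo/... paths are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str
    label: str
    language: str
    file_path: str


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    kind: str


@dataclass(frozen=True)
class Graph:
    language: str
    nodes: tuple[Node, ...]
    edges: tuple[Edge, ...]


def build_example_graph() -> Graph:
    """Build a two-node graph: src/api.py imports src/auth.py."""
    # The absolute paths below are placeholders, not real locations.
    api = Node("src_api_py", "src/api.py", "python", "/repo/src/api.py")
    auth = Node("src_auth_py", "src/auth.py", "python", "/repo/src/auth.py")
    return Graph("python", (api, auth), (Edge(api.id, auth.id, "import"),))
```

Since every class is frozen and the collections are tuples, a Graph cannot be mutated after an analyzer returns it, which is what lets the renderer and staleness detector treat it as a plain value.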
Per-Language Analyzers
Each analyzer is a self-contained brick exposing exactly one entry point:
def normalize(paths: Iterable[Path]) -> Graph: ...
The function:
- Reads each file with encoding="utf-8", errors="ignore".
- Skips files larger than ~5 MB.
- Wraps parsing in try/except and skips files that fail to parse.
- Returns a Graph whose language field matches the analyzer.
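The read, size-cap, and skip-on-failure steps above can be sketched for Python sources with the standard ast module. This is an assumption-laden sketch, not the code in python_analyzer: extract_imports and MAX_BYTES are hypothetical names, the exact cap is a guess at "~5 MB", and the function stops short of building a Graph.

```python
import ast
from collections.abc import Iterable
from pathlib import Path

MAX_BYTES = 5 * 1024 * 1024  # assumed value for the "~5 MB" cap


def extract_imports(paths: Iterable[Path]) -> dict[str, list[str]]:
    """Map each readable, parsable file to the module names it imports."""
    result: dict[str, list[str]] = {}
    for path in paths:
        try:
            if path.stat().st_size > MAX_BYTES:
                continue  # oversized file: skip without reading it
            source = path.read_text(encoding="utf-8", errors="ignore")
            tree = ast.parse(source)
        except (OSError, SyntaxError, ValueError):
            continue  # unreadable or unparsable: skip instead of raising
        names: list[str] = []
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names.extend(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                # Bare relative imports (`from . import x`) have no module name.
                names.append(node.module)
        result[str(path)] = names
    return result
```

A full analyzer would then turn each file into a Node and each resolved import into an Edge.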
Dispatcher
The dispatcher uses a registry that maps language name → extensions + module
name (string). It loads analyzers lazily via importlib.import_module so
adding a new language never requires touching the dispatcher's import
statements.
from scripts.dispatcher import analyze
graphs: dict[str, Graph] = analyze(target_path)
# {"python": Graph(...), "typescript": Graph(...)}
The dispatcher:
- Walks target_path with os.walk(..., followlinks=False).
- Skips IGNORE_DIRS (.git, node_modules, .venv, venv, __pycache__, dist, build, target, .mypy_cache, .pytest_cache, .tox).
- Buckets files by extension into language groups.
- Calls each language's normalize() with its file list.
- Returns a dict[language_name, Graph] for languages that produced any files.
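The walk-and-bucket portion of these steps might look like the sketch below. The EXTENSIONS mapping and bucket_by_language are illustrative names rather than the dispatcher's actual API, and the sketch returns file lists instead of calling any analyzer.

```python
import os
from pathlib import Path

IGNORE_DIRS = {".git", "node_modules", ".venv", "venv", "__pycache__",
               "dist", "build", "target", ".mypy_cache", ".pytest_cache", ".tox"}

# Illustrative extension table mirroring the Supported Languages section.
EXTENSIONS = {
    "python": {".py"},
    "typescript": {".ts", ".tsx", ".js", ".jsx", ".mjs", ".cjs"},
    "rust": {".rs"},
    "go": {".go"},
}


def bucket_by_language(target: Path) -> dict[str, list[Path]]:
    """Group files under target by language, pruning ignored directories."""
    ext_to_lang = {ext: lang for lang, exts in EXTENSIONS.items() for ext in exts}
    buckets: dict[str, list[Path]] = {}
    for root, dirs, files in os.walk(target, followlinks=False):
        # Editing dirs in place stops os.walk from descending into them.
        dirs[:] = [d for d in dirs if d not in IGNORE_DIRS]
        for name in files:
            lang = ext_to_lang.get(Path(name).suffix)
            if lang is not None:
                buckets.setdefault(lang, []).append(Path(root) / name)
    return buckets
```

Only languages with at least one matching file appear as keys, which matches the "detected" rule used elsewhere in this document.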
Mermaid Renderer
The renderer is language-blind:
from scripts.mermaid_renderer import render, render_combined
per_language: str = render(graph) # one diagram for one language
combined: str = render_combined(graphs) # one diagram, one subgraph/lang
Node IDs are sanitized ([^A-Za-z0-9_] -> _) and labels with quotes are
escaped to prevent diagram-syntax injection.
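A minimal sketch of the sanitization and rendering idea follows. The function names are hypothetical, the input is simplified to labels and label pairs rather than a Graph, and the use of Mermaid's #quot; entity for quotes is an assumption about how escaping could be done, not a statement of what mermaid_renderer emits.

```python
import re


def sanitize_id(raw: str) -> str:
    """Replace every character outside [A-Za-z0-9_] with an underscore."""
    return re.sub(r"[^A-Za-z0-9_]", "_", raw)


def escape_label(label: str) -> str:
    """Escape double quotes so a label cannot close its own quoted string."""
    return label.replace('"', "#quot;")


def render_flowchart(labels: list[str], edges: list[tuple[str, str]]) -> str:
    """Render node declarations followed by edges as a top-down flowchart."""
    lines = ["flowchart TD"]
    for label in labels:
        lines.append(f'{sanitize_id(label)}["{escape_label(label)}"]')
    for src, dst in edges:
        lines.append(f"{sanitize_id(src)} --> {sanitize_id(dst)}")
    return "\n".join(lines)
```

Nothing in the sketch looks at the language of a node, which is the property "language-blind" refers to.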
Staleness Detection
from pathlib import Path

from scripts.staleness import is_stale

stale = is_stale(
    target_path=Path("src/"),
    diagram_path=Path("docs/architecture-python.mmd"),
    languages=["python"],
)
Returns True if any source file with a matching language extension has an
mtime newer than diagram_path. Generalizes the previous Python-only
behavior.
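The max-mtime comparison can be sketched as below. is_stale_sketch is a hypothetical stand-in that takes a set of extensions directly instead of language names, omits IGNORE_DIRS pruning for brevity, and assumes a missing diagram counts as stale; the real is_stale may handle these cases differently.

```python
import os
from pathlib import Path


def is_stale_sketch(target: Path, diagram: Path, extensions: set[str]) -> bool:
    """Return True if the diagram is missing or older than a matching source."""
    if not diagram.exists():
        return True  # assumption: no diagram yet means it needs generating
    diagram_mtime = diagram.stat().st_mtime
    newest = 0.0
    for root, _dirs, files in os.walk(target, followlinks=False):
        for name in files:
            if Path(name).suffix in extensions:
                newest = max(newest, (Path(root) / name).stat().st_mtime)
    return newest > diagram_mtime
```

Comparing modification times is cheap but coarse: touching a file without changing its imports still marks the diagram stale.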
CLI
The skill ships a single executable: scripts/visualizer.py.
python visualizer.py <path> [--output DIR] [--basename NAME]
[--check-staleness] [--combined]
| Flag | Default | Purpose |
| ------------------- | -------------- | --------------------------------------------------------------------- |
| <path> | required | Directory to analyze. Must exist and be a directory. |
| --output DIR | ./diagrams | Output directory for .mmd files. |
| --basename NAME | architecture | Filename stem. Validated against ^[A-Za-z0-9._-]+$. |
| --check-staleness | off | Print staleness report for existing diagrams; exit non-zero if stale. |
| --combined | off | Also write <basename>-combined.mmd containing all languages. |
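For readers who want to see how the flags in this table map onto argument parsing, here is a sketch using argparse. It is a guess at the shape of the CLI, not the contents of visualizer.py; build_parser and basename_type are hypothetical names.

```python
import argparse
import re

BASENAME_RE = re.compile(r"^[A-Za-z0-9._-]+$")


def basename_type(value: str) -> str:
    """Reject basenames containing path separators or other unsafe characters."""
    # fullmatch also rejects a trailing newline, which `$` alone would allow.
    if not BASENAME_RE.fullmatch(value):
        raise argparse.ArgumentTypeError(f"invalid basename: {value!r}")
    return value


def build_parser() -> argparse.ArgumentParser:
    """Declare the positional path and the four documented options."""
    parser = argparse.ArgumentParser(prog="visualizer.py")
    parser.add_argument("path", help="Directory to analyze")
    parser.add_argument("--output", default="./diagrams", metavar="DIR")
    parser.add_argument("--basename", default="architecture",
                        type=basename_type, metavar="NAME")
    parser.add_argument("--check-staleness", action="store_true")
    parser.add_argument("--combined", action="store_true")
    return parser
```

Validating the basename inside the argparse type callback means a bad value is rejected before any file is read or written.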
Output Files
| File | Contents |
| --------------------------------------------- | ------------------------------------------------------ |
| <basename>-python.mmd | Mermaid diagram for Python modules and their imports. |
| <basename>-typescript.mmd | Mermaid diagram for TS/JS files and their imports. |
| <basename>-rust.mmd | Mermaid diagram for Rust modules and use edges. |
| <basename>-go.mmd | Mermaid diagram for Go packages and import edges. |
| <basename>-combined.mmd (with --combined) | One diagram with one subgraph per detected language. |
Files are only written for languages that were actually detected.
Quick Start
Generate diagrams for a polyglot repo
python amplifier-bundle/skills/code-visualizer/scripts/visualizer.py . \
--output docs/diagrams --combined
Output (for this repo, which contains Python and JS):
docs/diagrams/architecture-python.mmd
docs/diagrams/architecture-typescript.mmd
docs/diagrams/architecture-combined.mmd
Check freshness in CI
python amplifier-bundle/skills/code-visualizer/scripts/visualizer.py src/ \
--output docs/diagrams --check-staleness
# exits 1 if any per-language diagram is older than its source set
Generate for a single language
Provide a path that only contains files of one language; the dispatcher will
detect a single language and emit a single .mmd:
python visualizer.py src/auth/ # Python-only -> architecture-python.mmd
Auto-Detection Rules
- The dispatcher walks <path>, skipping IGNORE_DIRS and symlinks.
- Files are bucketed by extension into one of the supported languages.
- A language is "detected" if at least one file matches.
- Each detected language is analyzed independently.
- With --combined, the renderer composes one mermaid diagram with one subgraph per detected language. Cross-language edges are not inferred in the MVP.
Example Output
For a repo with:
- src/api.py importing src/auth.py
- web/index.ts importing web/utils.ts
architecture-python.mmd:
flowchart TD
src_api_py["src/api.py"]
src_auth_py["src/auth.py"]
src_api_py --> src_auth_py
architecture-typescript.mmd:
flowchart TD
web_index_ts["web/index.ts"]
web_utils_ts["web/utils.ts"]
web_index_ts --> web_utils_ts
architecture-combined.mmd:
flowchart TD
subgraph python ["python"]
src_api_py["src/api.py"]
src_auth_py["src/auth.py"]
src_api_py --> src_auth_py
end
subgraph typescript ["typescript"]
web_index_ts["web/index.ts"]
web_utils_ts["web/utils.ts"]
web_index_ts --> web_utils_ts
end
Note: the renderer emits the subgraph <id> ["<label>"] form (space between id and bracketed label), which is the Mermaid-documented syntax accepted across recent Mermaid versions. test_mermaid_renderer.py pins the exact emitted form.
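The combined layout shown above can be produced by wrapping each language's already-rendered lines in a subgraph. The sketch below assumes the per-language node and edge lines are available as lists of strings; render_combined_sketch is a hypothetical helper, not the signature of render_combined.

```python
def render_combined_sketch(bodies: dict[str, list[str]]) -> str:
    """Wrap each language's diagram lines in its own mermaid subgraph."""
    lines = ["flowchart TD"]
    for language, body in bodies.items():
        # Emits the `subgraph <id> ["<label>"]` form described in the note.
        lines.append(f'subgraph {language} ["{language}"]')
        lines.extend(body)
        lines.append("end")
    return "\n".join(lines)
```

Each subgraph is closed before the next opens, so no edge crosses a language boundary, consistent with cross-language edges being out of scope.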
Extending: Adding a New Language
The skill follows the brick philosophy: a new language is a new self-contained module. There is no base class to subclass.
- Create scripts/<lang>_analyzer.py with the entry point:

    from collections.abc import Iterable
    from pathlib import Path

    from graph import Edge, Graph, Node  # sibling import; works under `python visualizer.py`

    def normalize(paths: Iterable[Path]) -> Graph:
        nodes: list[Node] = []
        edges: list[Edge] = []
        for p in paths:
            # parse file, append nodes/edges
            ...
        return Graph(language="<lang>", nodes=tuple(nodes), edges=tuple(edges))

- Register the language in scripts/dispatcher.py:

    LANGUAGES = {
        "python": {"exts": {".py"}, "module": "python_analyzer"},
        "typescript": {"exts": {".ts", ".tsx", ".js", ".jsx", ".mjs", ".cjs"}, "module": "ts_analyzer"},
        "rust": {"exts": {".rs"}, "module": "rust_analyzer"},
        "go": {"exts": {".go"}, "module": "go_analyzer"},
        # add here: "<lang>": {"exts": {".ext"}, "module": "<lang>_analyzer"},
    }

- Add tests/test_<lang>_analyzer.py with tmp_path fixtures asserting nodes and edges produced by representative source snippets.
- Update the Supported Languages table above.
That's it. The renderer, dispatcher routing, staleness detector, and CLI all
work without further changes because they consume the language-blind Graph
data contract.
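Registration alone is enough because the analyzer module is looked up by name at run time. A minimal sketch of that lookup, assuming a registry shaped like LANGUAGES, is shown here; the "toy" entry, toy_analyzer, and load_normalize are hypothetical and exist only for the illustration.

```python
import importlib
from collections.abc import Callable

# Hypothetical registry entry; "toy_analyzer" is not part of the skill.
REGISTRY = {"toy": {"exts": {".toy"}, "module": "toy_analyzer"}}


def load_normalize(language: str) -> Callable:
    """Import the registered analyzer module by name and return normalize()."""
    module = importlib.import_module(REGISTRY[language]["module"])
    return module.normalize
```

The module is imported only when its language is requested, so a broken or missing analyzer for one language does not prevent the others from loading.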
Testing
Tests live under amplifier-bundle/skills/code-visualizer/tests/ and run via
pytest. The skill registers its tests/ directory in the repo's
pytest.ini testpaths so CI picks them up automatically.
Test files:
| File | Purpose |
| -------------------------- | -------------------------------------------------------------------- |
| test_python_analyzer.py | AST-driven import extraction; verifies edges for import/from. |
| test_ts_analyzer.py | import/require/dynamic import(); type-only and relative paths. |
| test_dispatcher.py | Mixed-language fixture; verifies correct routing per extension. |
| test_mermaid_renderer.py | Empty graphs, non-empty graphs, ID/label sanitization. |
| test_staleness.py | Mtime comparison across multiple language extensions. |
| test_smoke_repo.py | Runs dispatcher against the repo root; asserts non-empty mermaid for both Python and TypeScript/JavaScript. |
Run only the skill's tests:
pytest amplifier-bundle/skills/code-visualizer/tests -q
Security Considerations
- No code execution: Analyzers only parse source. No exec/eval/subprocess on analyzed files.
- Path validation: <path> and --output are resolved with Path.resolve() and rejected if non-existent or non-directory.
- Filename validation: --basename must match ^[A-Za-z0-9._-]+$.
- Symlink safety: os.walk(..., followlinks=False) plus IGNORE_DIRS prevents loops and escape.
- Bounded reads: Per-file size cap (~5 MB); UTF-8 decode with errors="ignore".
- Bounded regex: Anchored, no nested quantifiers; protects against ReDoS.
- Mermaid sanitization: Characters outside [A-Za-z0-9_] in node IDs are replaced with _; labels with embedded quotes are escaped.
- Stdlib-only: Zero third-party runtime dependencies; no supply-chain surface.
- Output containment: Writes are constrained to the resolved --output directory; source content is never logged.
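One way to enforce the output-containment rule is to resolve the candidate path and verify that it still sits under the resolved output directory. The sketch below is an assumption about how such a check could be written, not the skill's code; contained_output_path is a hypothetical helper and relies on Path.is_relative_to, available from Python 3.9.

```python
from pathlib import Path


def contained_output_path(output_dir: Path, basename: str, language: str) -> Path:
    """Build <basename>-<language>.mmd and refuse paths outside output_dir."""
    base = output_dir.resolve()
    candidate = (base / f"{basename}-{language}.mmd").resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"refusing to write outside {base}")
    return candidate
```

This check is a second line of defense: the --basename pattern already excludes path separators, so a traversal attempt should normally be rejected before this point.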
Limitations
- Static heuristics: Regex-based extraction for TS/JS/Rust/Go misses some edge syntax (TS type-only imports across multiple lines, Rust nested use {a, b::c}, Go cgo blocks). Documented per analyzer in source.
- No call graphs: Edges are import/use only. Runtime/dynamic imports beyond import("...")/__import__ are not modeled.
- External imports: Rendered as ghost target nodes inline; not resolved to real files.
- Combined view: Cross-language edges are out of MVP scope.
- Shell scripts: Not first-class; .sh files are ignored.
- Compiler-grade accuracy: Not a goal. The skill optimizes for "useful diagram in seconds" over "perfect AST."
Philosophy Alignment
| Principle | How v2.0 follows it |
| ----------------------- | ------------------------------------------------------------------------------------ |
| Ruthless Simplicity | Stdlib-only; regex over tree-sitter; max-mtime over semantic diff. |
| Zero-BS | Real parsers (ast for Python, regex for others). Limitations documented honestly. |
| Modular Design | Each analyzer is a brick with a single normalize() stud. No inheritance. |
| Brick Composition | Renderer/dispatcher/staleness are independent bricks reusing only the data contract. |
Migration from 1.x
The 1.x skill was Python-only. Forward-compatibility notes (verify against your actual 1.x integration before relying on them):
- Diagrams previously named <basename>.mmd are now <basename>-python.mmd. Update any references in README.md/ARCHITECTURE.md.
- Staleness reports now include a per-language breakdown. CI scripts that parsed the old single-line output should be updated to handle multiple languages.
- Any direct Python helper used in 1.x is superseded by dispatcher.analyze(path) returning a dict[language, Graph]. Callers that only want Python can use dispatcher.analyze(path)["python"].
Remember
The skill automates what developers forget across all four supported languages: keeping diagrams in sync with code. It's not a compiler; it's a fast, honest, multi-language snapshot.