Agent Skills: Literate Programming Skill

CRITICAL: ALWAYS activate this skill BEFORE making ANY changes to .nw files. Use proactively when: (1) creating, editing, reviewing, or improving any .nw file, (2) planning to add/modify functionality in files with .nw extension, (3) user asks about literate quality, (4) user mentions noweb, literate programming, tangling, or weaving, (5) working in directories containing .nw files, (6) creating new modules/files that will be .nw format. Trigger phrases: 'create module', 'add feature', 'update', 'modify', 'fix' + any .nw file. Never edit .nw files directly without first activating this skill to ensure literate programming principles are applied. (project, gitignored)

ID: dbosk/claude-skills/literate-programming

Install this agent skill to your local environment:

pnpm dlx add-skill https://github.com/dbosk/claude-skills/tree/HEAD/literate-programming


literate-programming/SKILL.md

Skill Metadata

Name
literate-programming

Literate Programming Skill

CRITICAL: This skill MUST be activated BEFORE making any changes to .nw files!

You are an expert in literate programming using the noweb system.

Reference Files

This skill includes detailed references in references/:

| File | Content | Search patterns |
|------|---------|-----------------|
| noweb-commands.md | Tangling, weaving, flags, troubleshooting | notangle, noweave, -R, -L |
| testing-patterns.md | Test organization, placement, dependency testing | test functions, pytest, after implementation |
| git-workflow.md | Version control, .gitignore, pre-commit | git, commit, generated files |
| multi-directory-projects.md | Large project organization, makefiles | src/, doc/, tests/, MODULES |
| project-initialization.md | New project setup, templates, checklist | new project, initialize, pyproject.toml |
| preamble.tex | Standard LaTeX preamble for documentation | \usepackage, memoir |

When to Use This Skill

Correct Workflow

  1. User asks to modify a .nw file
  2. YOU ACTIVATE THIS SKILL IMMEDIATELY
  3. You plan the changes with literate programming principles
  4. You make the changes following the principles
  5. You regenerate code with make/notangle

Anti-pattern (NEVER do this)

  1. User asks to modify a .nw file
  2. You directly edit the .nw file ← WRONG
  3. Later review finds literate quality problems
  4. You have to redo everything

Remember

  • .nw files are NOT regular source code files
  • They combine documentation and code for human readers
  • Literate quality is AS IMPORTANT as code correctness
  • Bad literate quality = failed task, even if code works

Planning Changes

When making changes to a .nw file:

  1. Read the existing file to understand structure and narrative
  2. Plan with literate programming in mind:
    • What is the "why" behind this change?
    • How does this fit into the existing narrative?
    • What new chunks are needed? What are their meaningful names?
    • Where in the pedagogical order should this be explained?
  3. Design documentation BEFORE writing code:
    • Write prose explaining the problem and solution
    • Use subsections to structure complex explanations
  4. Decompose code into well-named chunks:
    • Each chunk = one coherent concept
    • Names describe purpose, not syntax (like pseudocode)
  5. Write the code chunks
  6. Regenerate and test

Key principle: If you find yourself writing code comments to explain logic, that explanation belongs in the documentation chunks instead.

Reviewing Literate Programs

When reviewing, evaluate:

  1. Narrative flow: Coherent story? Pedagogical order?
  2. Variation theory: Contrasts used? "Whole, parts, whole" structure?
  3. Chunk quality: Meaningful names? Focused on single concepts?
  4. Explanation quality: Explains "why" not just "what"? Red flags: prose that begins "We [verb] the [noun]" matching a function name; prose that describes parameter types visible in the signature; prose that restates conditionals without explaining why they matter.
  5. Test organization: Tests after implementation, not before?
  6. Proper noweb syntax: [[code]] notation? Valid chunk references?

Core Philosophy

Literate programming (Knuth) has two goals:

  1. Explain to human beings what we want a computer to do
  2. Present concepts in order best for human understanding (psychological order, not compiler order)

Variation Theory

Apply variation-theory skill when structuring explanations:

  • Contrast: Show what something IS vs what it is NOT
  • Separation: Start with whole (module outline), then parts (chunks)
  • Generalization: Show pattern across different contexts
  • Fusion: Integrate parts back into coherent whole

CRITICAL: Show concrete examples FIRST, then state general principles. Readers cannot discern a pattern without first experiencing variation.

Noweb File Format

Documentation Chunks

  • Begin with @ followed by space or newline
  • Contain explanatory text (LaTeX, Markdown, etc.)
  • Copied verbatim by noweave

Code Chunks

  • Begin with <<chunk name>>= on a line by itself (column 1)
  • End when another chunk begins or at end of file
  • Reference other chunks using <<chunk name>>
  • Multiple chunks with same name are concatenated

Syntax Rules

  • Quote code in documentation using [[code]] (escapes LaTeX special chars)
  • Escape: @<< for literal <<, @@ in column 1 for literal @
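
To make these rules concrete, here is a minimal hypothetical .nw file (the [[hello.py]] file name and chunk names are illustrative) combining a documentation chunk, a root code chunk, and a referenced sub-chunk:

```
@ This module prints a greeting.  We keep the greeting logic in
a named sub-chunk so the root chunk [[hello.py]] reads like an
outline.

<<hello.py>>=
<<define the greeting function>>

if __name__ == "__main__":
    print(greet("world"))
@

The function returns the greeting rather than printing it so
that callers can reuse it.

<<define the greeting function>>=
def greet(name):
    """Return a greeting for ``name``."""
    return f"Hello, {name}!"
@
```

The lone @ at column 1 ends each code chunk and resumes documentation.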

Writing Guidelines

  1. Start with the human story - problem, approach, design decisions

  2. Introduce concepts in pedagogical order - not compiler order

  3. Use meaningful chunk names - 2-5 word summary of purpose (like pseudocode)

  4. Reference variables in chunk names - when a chunk operates on a specific variable, use [[variable]] notation in the chunk name to make the connection explicit (e.g., <<add graders to [[graders]] list>>)

  5. Decompose by concept, not syntax

  6. Explain the "why" - don't just describe what the code does. Prose that merely restates the code in English teaches nothing. Good prose explains why a design choice was made: what alternative was rejected, what would break without this approach, or what constraint drives the implementation.

    Self-test: If your prose could be mechanically generated from the function signature, it's "what" not "why." Ask yourself: What design decision does this paragraph justify? What alternative did we reject and why? If the paragraph doesn't answer either question, rewrite it.

    BAD — prose restates code in English:

    \subsection{Counting $n$-grams}
    
    We count overlapping $n$-grams.
    If $n$ is larger than the input, the result is empty.
    
    <<functions>>=
    def ngram_counts(text, *, n):
        ...
    @
    

    GOOD — prose explains why this design choice:

    \subsection{Counting $n$-grams}
    
    We use overlapping $n$-grams because they capture all positional
    contexts---in \enquote{THE}, overlapping bigrams yield TH and HE,
    whereas non-overlapping would only yield TH.  This matches the
    standard definition used in cryptanalysis.
    
    <<functions>>=
    def ngram_counts(text, *, n):
        ...
    @
    

    Red flags that prose is "what" not "why":

    • Begins "We [verb] the [noun]" where the verb matches a function name
    • Describes parameter types or return values already in the signature
    • Restates conditional logic ("If X, we do Y") without explaining why X matters
  7. Keep chunks focused — one function per <<functions>>= chunk with prose before it. Each function (or small group of tightly related functions) gets its own <<functions>>= chunk preceded by explanatory prose. Never put multiple unrelated functions in a single chunk.

    BAD — four functions crammed into one chunk with minimal prose:

    \subsection{Helper Functions}
    
    We provide several utility functions.
    
    <<functions>>=
    def normalize_text(text): ...
    
    def letters_only(text): ...
    
    def key_shifts(key): ...
    
    def index_of_coincidence(text): ...
    @
    

    GOOD — each function with its own subsection and prose:

    \subsection{Text Normalization}
    
    Before analysis, we strip non-alphabetic characters and
    convert to lowercase so that frequency counts are meaningful.
    
    <<functions>>=
    def normalize_text(text): ...
    @
    
    \subsection{Index of Coincidence}
    
    The index of coincidence measures how likely two randomly
    chosen letters from a text are identical ...
    
    <<functions>>=
    def index_of_coincidence(text): ...
    @
    
  8. Decompose long functions into named sub-chunks — If a function has more than ~25 lines and contains two or more distinct algorithmic phases, decompose it into named sub-chunks. Each sub-chunk name should read like a step in an algorithm description. The prose before each sub-chunk explains why that phase works the way it does. This is the classic Knuth technique.

    BAD — 80-line function with one line of prose:

    We generate plaintext by concatenating sentences.
    
    <<functions>>=
    def generate_plaintext(size, *, sources, seed=None):
        """..."""
        if size <= 0:
            raise ValueError(...)
        paragraphs = extract_paragraphs(sources, ...)
        ...  # 75 more lines
        return normalize(prefix, options)
    @
    

    GOOD — function body decomposed into named sub-chunks with prose:

    <<functions>>=
    def generate_plaintext(size, *, sources, seed=None):
        """..."""
        <<prepare filtered paragraphs>>
        <<pick random starting point>>
        <<collect sentences until target length>>
        <<select closest sentence boundary>>
    @
    
    We extract paragraphs from the corpus, removing headings and ToC
    entries.  Paragraphs lacking sentence-ending punctuation are
    discarded---they are typically list items or table rows.
    
    <<prepare filtered paragraphs>>=
    if size <= 0:
        raise ValueError("size must be positive")
    ...
    @
    
    To avoid always starting at the beginning of the corpus, we
    rotate to a random paragraph.
    
    <<pick random starting point>>=
    rng = random.Random(seed)
    ...
    @
    
  9. Use bucket chunks — distribute <<constants>>= near their relevant code - Define each constant in the section where it is conceptually relevant. Never group all constants into a single \subsection{Constants}.

    BAD — all constants dumped in one subsection:

    \subsection{Constants}
    
    <<constants>>=
    DATA_DIR = ...        # used in loading section
    GUTENBERG_START = ... # used in extraction section
    SENTENCE_RE = ...     # used in sentence splitting section
    KEEP_PUNCT = ...      # used in normalization section
    @
    

    GOOD — each constant near the code that uses it:

    \subsection{Loading Texts}
    
    <<constants>>=
    DATA_DIR = Path(__file__).parent / "data"
    @
    
    <<functions>>=
    def load_text(path): ...
    @
    
    \subsection{Extracting Body Text}
    
    <<constants>>=
    GUTENBERG_START = "*** START OF"
    GUTENBERG_END = "*** END OF"
    @
    
    <<functions>>=
    def extract_body(text): ...
    @
    
  10. Define constants for magic numbers - never hardcode values

  11. Co-locate dependencies with features - feature's imports in feature's section

  12. Prefer public functions - Default to making functions public with docstrings. Only use _-prefixed private functions for true internal helpers tightly coupled to a single caller. Public utilities (e.g., normalize_text, letters_only) are reusable across modules and discoverable via help(). Duplicated private helpers across modules (e.g., _to_ascii in both vigenere.nw and plaintexts.nw) are a sign the function should be public in a shared module.

  13. Keep lines under 80 characters - both prose and code

LaTeX Documentation Quality

Apply latex-writing skill. Most common anti-patterns in .nw files:

Lists with bold labels: Use \begin{description} with \item[Label], NOT \begin{itemize} with \item \textbf{Label}:

Code with manual escaping: Use [[code]], NOT \texttt{...\_...}

Manual quotes: Use \enquote{...}, NOT "..." or ``...''

Manual cross-references: Use \cref{...}, NOT Section~\ref{...}
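
To make the first anti-pattern concrete, here is a sketch of the same list written both ways (the labels are illustrative):

```latex
% BAD — itemize with manually bolded labels:
\begin{itemize}
  \item \textbf{Contrast:} show what something is and is not.
  \item \textbf{Fusion:} integrate parts into a whole.
\end{itemize}

% GOOD — description list with semantic labels:
\begin{description}
  \item[Contrast] show what something is and is not.
  \item[Fusion] integrate parts into a whole.
\end{description}
```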

Progressive Disclosure Pattern

When introducing high-level structure, use abstract placeholder chunks that defer specifics:

<<functions>>=
def cli_show(user_regex,
             <<options for filtering>>):
    <<implementation>>
@

[... later, explain each option ...]

\paragraph{The --all option}
<<options for filtering>>=
all: Annotated[bool, all_opt] = False,
@

Benefits: readable high-level structure, pedagogical ordering, maintainability.

The same technique applies to function bodies: long functions can use <<phase name>> sub-chunks to present algorithmic steps in pedagogical order with prose between them (see Writing Guideline 8, "Decompose long functions").

Chunk Concatenation Patterns

Use multiple definitions when building up a parameter list pedagogically:

\subsection{Adding the diff flag}
<<args for diff>>=
diff=args.diff,
@

[... later ...]

\subsection{Fine-tuning thresholds}
<<args for diff>>=
threshold=args.threshold
@

Use separate chunks when contexts differ (different scopes):

<<args from command line>>=
diff=args.diff,  # the args object is in scope here
@

<<params for recursion>>=
diff=diff,       # no args object, only parameters
@

Test Organization

CRITICAL: Tests MUST appear AFTER implementation, distributed throughout the file near the code they verify. NEVER create a \section{Tests} or \section{Unit Tests} that groups all tests at the end of the file.

See references/testing-patterns.md for detailed patterns.

Key rules:

  • Each implementation section is followed by its <<test functions>>= chunk
  • Use single <<test functions>> chunk name — noweb concatenates them
  • Use from module import * in the test file header
  • Frame tests pedagogically: "Let's verify this works..."
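
These rules come together in the test file's root chunk; this sketch assumes the module is named [[module.py]]:

```
<<test [[module.py]]>>=
from module import *

<<test functions>>
@
```

Noweb concatenates every <<test functions>> chunk from throughout the file into this single root, so each test sees the implementation's public names without per-test imports.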

BAD — all tests collected at the end:

\section{Encryption}
<<functions>>=
def encrypt(text, key): ...
@

\section{Decryption}
<<functions>>=
def decrypt(text, key): ...
@

\section{Tests}          % ← NEVER do this

<<test functions>>=
def test_encrypt(): ...
def test_decrypt(): ...
@

GOOD — each test immediately after its implementation:

\section{Encryption}
<<functions>>=
def encrypt(text, key): ...
@

Let's verify that encryption produces the expected ciphertext:

<<test functions>>=
def test_encrypt(): ...
@

\section{Decryption}
<<functions>>=
def decrypt(text, key): ...
@

We can verify that decryption inverts encryption:

<<test functions>>=
def test_decrypt(): ...
@

Multi-Directory Projects

For large projects (5+ .nw files), see references/multi-directory-projects.md.

Key structure:

project/
├── Makefile       # Root orchestrator (compile → test → docs)
├── pyproject.toml # Poetry packaging configuration
├── src/           # .nw files → .py + .tex
├── doc/           # Document wrapper (.nw), preamble.tex
├── tests/         # Extracted test files (unit/ subdir)
└── makefiles/     # Shared build rules (noweb.mk, subdir.mk)

Initializing a New Project

See references/project-initialization.md for full details. Quick checklist:

  1. Create pyproject.toml with [tool.poetry] packages/include/exclude
  2. Create src/.gitignore (*.py, *.tex) and tests/.gitignore (*.py)
  3. Create src/packagename/Makefile with explicit __init__.py rule
  4. Create src/packagename/packagename.nw with <<[[__init__.py]]>> and <<test [[packagename.py]]>> chunks
  5. Create tests/Makefile with auto-discovery (uses %20 encoding, cpif, unit/ subdirectory)
  6. Create doc/packagename.nw wrapper, doc/Makefile, doc/preamble.tex
  7. Create root Makefile orchestrating compile → test → docs
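
The root Makefile from step 7 might be sketched as follows; the target and directory names are assumptions based on the structure above, and recipe lines must start with tabs:

```makefile
.PHONY: all compile test docs

all: compile test docs

# Tangle .nw files into .py and .tex.
compile:
	$(MAKE) -C src

# Tests run against freshly tangled code.
test: compile
	$(MAKE) -C tests

# Weave and build the documentation.
docs: compile
	$(MAKE) -C doc
```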

LaTeX-Safe Chunk Names

Use [[...]] notation for Python chunks with underscores:

<<[[module_name.py]]>>=
def my_function():
    pass
@

Extract with: notangle -R"[[module_name.py]]" file.nw > module_name.py

Best Practices Summary

  1. Write documentation first - then add code

  2. Keep lines under 80 characters

  3. Check for unused chunks - run noroots to find typos

  4. Keep tangled code in .gitignore - .nw is source of truth

  5. NEVER commit generated files - .py and .tex from .nw are build artifacts

  6. Test your tangles - ensure extracted code runs

  7. Require PEP-257 docstrings on all public functions - Prose in .nw is for maintainers reading the literate source; docstrings are for users of the compiled .py who never see the .nw file. Both are needed. Private functions (prefixed _) may omit docstrings. Never use \cref or other LaTeX commands inside docstrings.

    BAD — function with prose but no docstring:

    We convert text to lowercase ASCII for uniform comparison.
    
    <<functions>>=
    def normalize_text(text):
        return text.lower().encode("ascii", "ignore").decode()
    @
    

    GOOD — prose for maintainers AND docstring for users:

    We convert text to lowercase ASCII for uniform comparison.
    
    <<functions>>=
    def normalize_text(text):
        """Return lowercase ASCII version of ``text``.
    
        Non-ASCII characters are silently dropped.
        """
        return text.lower().encode("ascii", "ignore").decode()
    @
    
  8. Include table of contents - add \tableofcontents in documentation

Git Workflow

See references/git-workflow.md for details.

Core rules:

  • Only commit .nw files to git
  • Add generated files to .gitignore immediately
  • Regenerate code with make after checkout/pull
  • Never commit generated .py or .tex files
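
For example, a src/.gitignore implementing these rules needs only the generated extensions:

```
*.py
*.tex
```

This leaves the .nw files as the only tracked sources in src/.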

Noweb Commands Quick Reference

See references/noweb-commands.md for details.

# Tangling
notangle -R"[[module.py]]" file.nw > module.py
noroots file.nw                              # List root chunks

# Weaving
noweave -n -delay -x -t2 file.nw > file.tex  # For inclusion
noweave -latex -x file.nw > file.tex         # Standalone

When Literate Programming Is Valuable

  • Complex algorithms requiring detailed explanation
  • Educational code where understanding is paramount
  • Code maintained by others
  • Programs where design decisions need documentation
  • Projects combining multiple languages/tools