Agent Skills: Image Explore - Visual Direction Brainstorming

Brainstorm multiple visual directions for a blog image, generate them in parallel, build a comparison page, and optionally publish as a shareable link (Surge.sh or gist).

ID: idvorkin/chop-conventions/image-explore

Install this agent skill locally:

pnpm dlx add-skill https://github.com/idvorkin/chop-conventions/tree/HEAD/skills/image-explore

Skill Files

skills/image-explore/SKILL.md

Skill Metadata

  • Name: image-explore
  • Description: Brainstorm multiple visual directions for a blog image, generate them in parallel, build a comparison page, and optionally publish as a shareable link (Surge.sh or gist).

Image Explore - Visual Direction Brainstorming

Generate multiple distinct visual directions for a blog image, render them all in parallel, build a comparison page, and optionally publish as a shareable link for feedback.

Arguments

Parse the user's input for:

  • Target: A file path (e.g., _d/ai-native-manager.md) or a freeform topic (e.g., "chaos of AI adoption")
  • --count N: Number of directions to brainstorm and generate (default: 5, max: 8)
  • --variants N: Number of minor variants per direction (default: 1, max: 3). Each variant tweaks the scene (different angle, lighting, composition) while keeping the same concept and shirt text.
  • --style 'description': Override the default illustration style (passed through to gen-image)
  • --aspect 'W:H': Aspect ratio (default: 3:4). Valid: 1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9
  • --ref 'path': Override reference image (default: raccoon canonical ref)
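The parsing rules above could be sketched with argparse (a sketch only; the actual skill parses freeform user input rather than CLI flags, and the function name is illustrative):

```python
import argparse

def parse_args(argv):
    # Illustrative parser for the arguments described above;
    # defaults and caps come from this doc.
    p = argparse.ArgumentParser(description="image-explore arguments")
    p.add_argument("target", help="file path or freeform topic")
    p.add_argument("--count", type=int, default=5, help="directions to brainstorm (max 8)")
    p.add_argument("--variants", type=int, default=1, help="variants per direction (max 3)")
    p.add_argument("--style", help="override the default illustration style")
    p.add_argument("--aspect", default="3:4",
                   choices=["1:1", "2:3", "3:2", "3:4", "4:3", "9:16", "16:9"])
    p.add_argument("--ref", help="override the reference image path")
    args = p.parse_args(argv)
    args.count = min(args.count, 8)       # enforce the documented caps
    args.variants = min(args.variants, 3)
    return args
```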

Workflow

Phase 1: Analyze Content

If the target is a file path:

  1. Read the file
  2. Identify the hook — what's the one idea a reader should remember?
  3. Note key metaphors, section themes, emotional arc
  4. Check for existing images (look for imagefeature, local_image, blob_image includes)

If the target is a freeform topic:

  1. Use it directly as the creative brief
  2. Skip to Phase 2

Phase 2: Brainstorm Directions

This is the creative core. Generate --count distinct visual directions. Each direction must have:

  • Name: 2-4 evocative words (e.g., "Circus Ringmaster", "Surfing the Wave")
  • Section: Which part of the post it maps to (or "standalone")
  • Scene: One-sentence description of the image
  • Vibe: What feeling it evokes (e.g., "controlled chaos", "quiet confidence")
  • Shirt text: For raccoon style, what the shirt reads (max 8 chars)

Directions must be meaningfully different. Vary across these axes:

  • Literal vs. metaphorical
  • Action vs. stillness
  • Humor vs. gravitas
  • Individual vs. group scene
  • Indoor vs. outdoor / grounded vs. fantastical

Avoid generating 5 variations of the same idea. If the post has one dominant metaphor, use it for at most 2 directions and find fresh angles for the rest.

Present directions as a table:

| #   | Name             | Section       | Scene                                    | Vibe           | Shirt   |
| --- | ---------------- | ------------- | ---------------------------------------- | -------------- | ------- |
| A   | Mission Control  | Year of Chaos | Raccoon at NASA console, screens on fire | "This is fine" | SHIP IT |
| B   | Surfing the Wave | AI Adoption   | Raccoon surfing tidal wave of AI debris  | Riding chaos   | SHIP IT |

Confirm with user via AskUserQuestion before generating. User may add, remove, or modify directions.

If --variants > 1: After user approves the directions, craft variant scenes for each. Each variant keeps the same concept, shirt text, and vibe, but varies the specific scene description (different angle, setting detail, composition, or lighting). Do NOT present variant scenes for approval — just generate them.
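The variant bookkeeping can be sketched as a small helper (hypothetical function; it follows the grouped-entry shape and {slug}-v{N}.webp naming used in Phase 3):

```python
def expand_variants(direction, variant_scenes):
    """Expand one approved direction into grouped variant entries.

    direction: dict with name/section/vibe/shirt; variant_scenes: one
    scene description per variant. Each entry keeps the concept, shirt
    text, and vibe, varying only the scene.
    """
    slug = direction["name"].lower().replace(" ", "-")
    entries = []
    for n, scene in enumerate(variant_scenes, start=1):
        entries.append({
            "name": f'{direction["name"]} v{n}',
            "group": direction["name"],
            "section": direction["section"],
            "vibe": direction["vibe"],
            "shirt": direction["shirt"],
            "scene": scene,
            "output": f"{slug}-v{n}.webp",
        })
    return entries
```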

Phase 3: Generate Images in Parallel

  1. Resolve the script path once:

    CHOP_ROOT="$(cd "$(dirname "$(readlink -f ~/.claude/skills/image-explore/SKILL.md)")" && git rev-parse --show-toplevel)"
    GEN="$CHOP_ROOT/skills/image-explore/generate.py"
    
  2. Write a directions.json file with all directions (used by both Phase 3 and Phase 4).

    Without variants (1 entry per direction):

    [
      {
        "name": "Mission Control",
        "section": "Year of Chaos",
        "vibe": "This is fine",
        "shirt": "SHIP IT",
        "scene": "Raccoon at NASA console, screens showing fire",
        "output": "mission-control.webp"
      }
    ]
    

    With variants (multiple entries per direction, grouped by group field):

    [
      {
        "name": "Mission Control v1",
        "group": "Mission Control",
        "section": "Year of Chaos",
        "vibe": "This is fine",
        "shirt": "SHIP IT",
        "scene": "Raccoon at NASA console, screens showing fire, dramatic front view",
        "output": "mission-control-v1.webp"
      },
      {
        "name": "Mission Control v2",
        "group": "Mission Control",
        "section": "Year of Chaos",
        "vibe": "This is fine",
        "shirt": "SHIP IT",
        "scene": "Raccoon at NASA console seen from side, leaning back in chair sipping tea",
        "output": "mission-control-v2.webp"
      }
    ]
    

    The group field enables build-page.py to group variants under a shared heading. Output filenames follow the pattern {slug}-v{N}.webp when using variants.

    Scene-first prompt ordering ("scene_first": true):

    By default, prompts are assembled as: character style → "large & prominent 40%" → scene. This works well for character-focused images but fights wide-field compositions where characters should be small elements in a larger scene.

    Set "scene_first": true on any direction where the scene composition matters more than character prominence. This reorders the prompt to: scene → character style → shirt text, and drops the "40% of image" instruction. Use it for:

    • Bird's-eye/aerial views of fields, landscapes, maps
    • Group scenes where many characters are small
    • Any composition where the environment dominates

    {
      "name": "Overhead Field",
      "scene": "Aerial drone shot of a soccer field with raccoons scattered across it...",
      "shirt": "NEXT",
      "output": "overhead-field.webp",
      "scene_first": true
    }
    
  3. Generate all images in parallel with a single command:

    uv run "$GEN" --batch directions.json
    

    Pass --aspect, --ref, or --style if overriding defaults. The script handles env loading, prompt assembly, ref image resolution, and parallel execution via thread pool (secrets never leak into command strings).

  4. After batch completes, the directions JSON is automatically augmented with _prompt and _duration_s fields for each entry (used by the comparison page for debug info).
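The scene-first reordering from step 2 might be assembled roughly like this (a sketch only; generate.py's exact prompt wording is an assumption here):

```python
def assemble_prompt(style, scene, shirt, scene_first=False):
    # Sketch of the two orderings described above.
    shirt_part = f'Shirt text reads "{shirt}".'
    if scene_first:
        # Scene leads; the "40% of image" prominence instruction is dropped.
        return f"{scene} {style} {shirt_part}"
    # Default: character style leads, with the prominence instruction.
    return (f"{style} Large & prominent, about 40% of the image. "
            f"{scene} {shirt_part}")
```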

Phase 3b: Verify & Retry

After generation, verify each image actually matches its scene description before showing to the user. This catches cases where Gemini ignores complex scene descriptions (e.g., split-screens, multiple characters, specific compositions).

  1. Launch background sub-agents in parallel (one per image) to verify each result. Each agent should:
    • Read the generated image file (Claude has vision)
    • Compare what it sees against the scene description
    • Check for:
      • Scene composition: Does the layout match? (e.g., split-screen actually split, multiple characters present)
      • Key elements: Are the described elements visible? (e.g., dust cloud, trajectory arc, binoculars)
      • Shirt text: Is it readable and roughly correct?
    • Return a verdict: pass (scene clearly rendered) or fail with a short explanation of what's wrong
  2. Collect results from all agents. This runs concurrently and doesn't block other work.
  3. Write verification results back to the directions JSON — for each entry, add:
    • _verification: "pass" or "fail"
    • _verification_reason: short explanation (e.g., "Solo raccoon portrait, no soccer field or group scene") These fields are rendered in the comparison page's collapsible debug details.
  4. For any failures, retry up to 2 times:
    • Strengthen the scene description (be more explicit, add emphasis)
    • Re-run generate.py for just the failed entries (write a temporary batch JSON)
    • Launch verification agents again for the retried images
    • Update _verification and _verification_reason with the retry result
  5. After retries, report any still-failing images to the user rather than silently including bad results

What counts as a failure:

  • Single character when scene calls for a group (or vice versa)
  • Missing the core concept entirely (e.g., "split-screen" rendered as single scene)
  • Wrong setting (indoor when outdoor was specified)

What does NOT count as a failure:

  • Shirt text slightly wrong (Gemini often struggles with exact text)
  • Style differences from the reference
  • Minor composition differences (angle, lighting)

  6. Show all verified images to the user with the Read tool.
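The retry batching in step 4 might look roughly like this (a sketch; assumes the verification agents have already written _verification back into the JSON):

```python
import json
import tempfile

def write_retry_batch(directions_path):
    """Collect failed entries into a temporary batch JSON for re-generation.

    Returns the temp file path (pass it to: uv run "$GEN" --batch <path>),
    or None if nothing failed.
    """
    with open(directions_path) as f:
        entries = json.load(f)
    failed = [e for e in entries if e.get("_verification") == "fail"]
    if not failed:
        return None
    tmp = tempfile.NamedTemporaryFile(
        mode="w", suffix=".json", prefix="retry-", delete=False)
    json.dump(failed, tmp, indent=2)
    tmp.close()
    return tmp.name
```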

Phase 4: Build Comparison Page

Build and serve the comparison page (reuses the same directions.json; build-page.py reads name/section/vibe/shirt and accepts either image or output for the file path):

uv run "$CHOP_ROOT/skills/image-explore/build-page.py" \
  --title "Image Explore: Topic Name" \
  --dir docs/image-explore-topic/ \
  --images-dir images/ \
  directions.json

Options:

  • --images-dir PATH: Where to find generated images (default: current directory). Useful when images were written to a different directory than where you run the command (e.g., images/).
  • --no-serve: Skip starting the HTTP server.

This creates the showboat doc, converts images, generates HTML via pandoc, and starts a local HTTP server. It prints the Tailscale URL.

When directions.json contains entries with a group field, the page groups variants under shared direction headings with sub-headers for each variant.
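The grouping behavior could be implemented along these lines (a sketch, not build-page.py's actual code):

```python
def group_entries(entries):
    """Group directions.json entries for the comparison page.

    Entries sharing a "group" field render under one shared heading;
    ungrouped entries stand alone under their own name.
    """
    groups = {}
    for e in entries:
        key = e.get("group", e["name"])
        groups.setdefault(key, []).append(e)
    return groups
```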

Phase 5: Publish (Ask First)

Ask the user: "Want to publish this as a shareable link?" and offer two options:

  1. Surge.sh (Recommended) — Full HTML/CSS/JS support, lightbox clicking works
  2. GitHub Gist — Simpler but gisthost may block inline JS (lightbox won't work)

Option A: Surge.sh

# Prepare deploy directory (random path to avoid accidental overwrites)
SURGE_DIR=$(mktemp -d /tmp/surge-XXXXXXXX)
cp <docs-dir>/demo.html "$SURGE_DIR/index.html"
cp <docs-dir>/*.png "$SURGE_DIR/"

# Deploy (pick a descriptive subdomain)
surge "$SURGE_DIR" <descriptive-name>.surge.sh

Option B: GitHub Gist

Uses the gist-image skill technique (create gist, clone, push binary files via git) plus gisthost-specific URL rewriting. The helper script automates this:

uv run "$CHOP_ROOT/skills/image-explore/publish-gist.py" demo.html --title "Description"

This handles: gist creation, image conversion to JPEG, URL rewriting, git push. It prints the gisthost URL. Note: gisthost may block inline <script> tags, so the lightbox/click-to-expand won't work.

Phase 6: Apply Selection (Optional)

If the user picks a winner, offer to:

  1. Update the blog post's imagefeature frontmatter
  2. Update any local_image_float_right includes
  3. Copy the chosen image to the blog's images directory with a permanent name
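Step 1 could be sketched as a small frontmatter rewrite (hypothetical helper; assumes Jekyll-style --- delimited frontmatter with a simple imagefeature: value key):

```python
import re

def set_image_feature(markdown_text, image_path):
    # Update an existing imagefeature line, or insert one just after
    # the opening --- delimiter if the key is missing.
    new_line = f"imagefeature: {image_path}"
    if re.search(r"^imagefeature:.*$", markdown_text, flags=re.M):
        return re.sub(r"^imagefeature:.*$", new_line, markdown_text, flags=re.M)
    return markdown_text.replace("---\n", f"---\n{new_line}\n", 1)
```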

Default Raccoon Style

When no --style is given, use this base style (same as gen-image):

Cute anthropomorphic raccoon character with chibi proportions (oversized head, small body), dark raccoon mask markings around eyes, big friendly dark eyes, small black nose, round brown ears with lighter inner ear, soft brown felt/plush fur, striped ringed tail with brown and dark brown bands. Wearing big round rainbow-colored glasses (frames cycle through red, orange, yellow, green, blue, purple), green t-shirt with bold white text, blue denim shorts, IMPORTANT: mismatched Crocs shoes — one BLUE Croc on the left foot and one YELLOW Croc on the right foot (never the same color on both feet). Soft plush 3D/vinyl toy illustration style, studio softbox lighting, clean warm pastel background, subtle vintage film grain, children's book style. Full body.

Error Handling

  • Missing API key: Check ~/.env first, then stop and ask user
  • Generation failure: Report which direction failed, ask to retry or skip
  • showboat not installed: Fall back to plain markdown + pandoc without showboat
  • magick not installed: Warn, use PNG directly (larger files)
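The API-key check might be sketched like this (the variable name GEMINI_API_KEY and the KEY=value format of ~/.env are assumptions):

```python
import os
from pathlib import Path

def find_api_key(var="GEMINI_API_KEY"):
    # Check the environment first, then ~/.env; return None so the
    # caller can stop and ask the user, per the rule above.
    if os.environ.get(var):
        return os.environ[var]
    env_file = Path.home() / ".env"
    if env_file.exists():
        for line in env_file.read_text().splitlines():
            if line.startswith(f"{var}="):
                return line.split("=", 1)[1].strip().strip('"')
    return None
```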

Tips

  • After user picks a winner, they may want to iterate on it — offer to re-run gen-image with refinements
  • If a direction doesn't render well, suggest prompt tweaks rather than just re-rolling
  • Keep the comparison page around for reference even after picking a winner