Agent Skills: Image Generator

|

UncategorizedID: aiskillstore/marketplace/image-generator

Install this agent skill to your local

pnpm dlx add-skill https://github.com/aiskillstore/marketplace/tree/HEAD/skills/92bilal26/image-generator

Skill Files

Browse the full folder contents for image-generator.

Download Skill

Loading file tree…

skills/92bilal26/image-generator/SKILL.md

Skill Metadata

Name
image-generator
Description
|

Image Generator

Generate professional teaching visuals using Gemini 3 with multi-turn reasoning partnership.

Quick Start

# 1. Start browser (via browser-use skill)
bash .claude/skills/browser-use/scripts/start-server.sh

# 2. Navigate to Gemini
# Use browser_navigate to https://gemini.google.com/

# 3. Generate image from creative brief
# Paste creative brief β†’ Wait 30-35s β†’ Verify 6 gates β†’ Download

Core Principles

  1. Reasoning over prediction - Creative briefs (Story/Intent/Metaphor) activate reasoning; pixel specs don't
  2. Multi-turn partnership - Teach Gemini your standards through principle-based feedback
  3. 6-gate quality - Explicit pass/fail before download
  4. Autonomous batch - No permission-asking between visuals

Input: Creative Brief Format

Receive from visual-asset-workflow:

## The Story
[Narrative about what's visualized]

## Emotional Intent
[What it should FEEL like]

## Visual Metaphor
[Universal concept for instant comprehension]

## Subject / Composition / Action / Location / Style
[Gemini 3 prompt structure]

## Color Semantics
Blue (#2563eb) = Authority | Green (#10b981) = Execution

## Typography Hierarchy
Largest: Key insight | Medium: Supporting | Smallest: Context

Do NOT convert to pixel specs - use as-is to activate reasoning.

Workflow (Per Visual)

| Step | Action | Tool | |------|--------|------| | 1 | Navigate to gemini.google.com | browser_navigate | | 2 | Select "🍌 Create Image" | browser_click | | 3 | Paste creative brief | browser_type | | 4 | Wait 30-35 seconds | browser_wait_for | | 5 | Verify 6 gates (below) | Visual inspection | | 6 | If fail: Iterate with feedback (max 3) | browser_type | | 7 | If pass: Download full size | browser_click | | 8 | Copy to apps/learn-app/static/img/part-{N}/chapter-{NN}/ | Bash | | 9 | Embed in lesson immediately | Edit | | 10 | NEW CHAT for next visual | browser_navigate |

Quality Gates (ALL Must Pass)

| Gate | Criterion | Fail Action | |------|-----------|-------------| | 1. Spelling | 99% accuracy (Y-Combinator, Kubernetes) | Iterate | | 2. Layout | Proportions match prompt (2Γ—2 not 3Γ—1) | Iterate | | 3. Color | Brand colors match (#2563eb not #002050) | Iterate | | 4. Typography | Largest = key concept (not decoration) | Iterate | | 5. Teaching | <5 sec concept grasp at target proficiency | Iterate | | 6. Uniqueness | Not duplicate of existing chapter image | New chat |

Decision: ALL pass β†’ Download | ANY fail β†’ Iterate (max 3 tries)

Iteration: Principle-Based Feedback

When gate fails, provide teaching feedback:

Gate 4 FAILED: Typography hierarchy incorrect

The largest text is "$100K" (supporting detail) but should be "$3T"
(key insight students must grasp).

Increase '$3T' to dominant size. Reduce '$100K' to supporting size.
Information importance drives sizing.

Batch Mode

When invoked with "generate all visuals":

For EACH visual in list:
  A. NEW CHAT (context isolation)
  B. Generate (paste brief)
  C. Verify 6 gates
  D. Iterate if needed (max 3)
  E. Download when pass
  F. Embed in lesson
  G. Log "βœ… N/M"
  H. NEXT (no stopping)

Never ask: "Continue?" "Pause here?" "Review?"

Report at END only:

BATCH COMPLETE
βœ… Generated: 16/18
⚠️ Deferred: 2 (quality issues)
Location: apps/learn-app/static/img/part-{N}/

Proficiency Limits

| Level | Max Elements | Grasp Time | |-------|--------------|------------| | A2 | 5-7 | <5 sec | | B1 | 7-10 | <10 sec | | C2 | No limit | N/A |

Token Conservation (Batch Mode)

For >8 visuals, condense briefs:

Original (250 tokens):

"Top Layer shows Coordinator at center top with label 'Orchestrator'
featuring conductor icon, with role 'Strategic oversight'..."

Condensed (80 tokens):

"Top Layer - Coordinator: Center top, 'Orchestrator' (conductor),
Role: 'Strategic oversight', Gold (#fbbf24), Large hexagon."

Keep: Story, Intent, Metaphor, Colors, Reasoning Condense: Long examples β†’ Short labels

Anti-Patterns

| Don't | Why | |-------|-----| | Accept first output without 6 gates | Quality standard violation | | Ask permission between batch items | Breaks autonomous agency | | Convert briefs to pixel specs | Defeats reasoning activation | | Skip embedding step | Creates orphan images | | Reuse same chat for next visual | Context contamination |

Session Interruption

If session ends mid-batch, create checkpoint:

# Checkpoint: Part {N}
Status: INTERRUPTED at 8/18

## Completed:
- βœ… Image 1: filename (embedded lesson-01.md)
- βœ… Image 2: filename (embedded lesson-02.md)

## Remaining:
- ⏳ Image 8: filename

On continuation: Read checkpoint β†’ Resume β†’ Update incrementally

Success Indicators

  • βœ… All 6 gates verified before download
  • βœ… Batch completion without permission-asking
  • βœ… Principle-based iteration feedback
  • βœ… Images organized by part/chapter
  • βœ… Immediate embedding (no orphans)
  • βœ… >85% production-ready rate