Image Generation

Generate, edit, and upscale images with standardized quality tiers and embedded best practices.

Quick Start

Need image?
├─ Text/Logo → bun scripts/gen.ts "..." --text [-t tier]
├─ Photo/Art → bun scripts/gen.ts "..." [-t tier]
├─ Edit existing → bun scripts/edit.ts <img> "..." [-t tier]
├─ Upscale → bun scripts/upscale.ts <img> [-t tier]
├─ Vectorize → bun scripts/svg.ts <img> ($0.01/img)
└─ Remove BG → bun scripts/rembg.ts <img> (FREE)

Tier selection:
├─ iterate  → FREE drafts (~96/day via Cloudflare)
├─ default  → Daily driver ($0.008/MP)
├─ premium  → Final assets ($0.03/MP)
└─ max      → Critical work, SOTA ($0.06-0.07/MP)

Entry Points

| Script | Purpose |
|--------|---------|
| bun scripts/gen.ts | Text → Image |
| bun scripts/edit.ts | Image + Instruction → Image |
| bun scripts/upscale.ts | Image → Larger Image |
| bun scripts/svg.ts | Image → SVG ($0.01/img) |
| bun scripts/rembg.ts | Remove background (FREE) |

Prompting Best Practices

CRITICAL: Good prompts are the difference between unusable output and production-ready assets.

The Universal Prompt Structure

[Subject] + [Action/Pose] + [Environment] + [Style/Medium] + [Lighting] + [Camera/Composition]
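
The structure above can be treated as an ordered template. The following is a minimal sketch, not part of the skill's scripts: `PromptParts` and `buildPrompt` are hypothetical names, and the sample values are taken from this section's example prompt.

```typescript
// Hypothetical helper: field names mirror the bracketed slots in the structure.
interface PromptParts {
  subject: string;
  action?: string;
  environment?: string;
  style?: string;
  lighting?: string;
  camera?: string;
}

// Join the non-empty parts, in the documented order, into descriptive sentences.
function buildPrompt(parts: PromptParts): string {
  const ordered: (string | undefined)[] = [
    // Subject, action, and environment read best as one sentence.
    [parts.subject, parts.action, parts.environment].filter(Boolean).join(" "),
    parts.style,
    parts.lighting,
    parts.camera,
  ];
  return ordered
    .filter((p): p is string => typeof p === "string" && p.trim().length > 0)
    .map((p) => p.trim().replace(/[.\s]+$/, "") + ".")
    .join(" ");
}

const owlPrompt = buildPrompt({
  subject: "A cybernetic owl",
  action: "perched on a neon sign",
  environment: "in a rain-soaked alley",
  lighting: "Cinematic lighting with teal and orange highlights",
  camera: "Shot on 35mm film, shallow depth of field, hyper-detailed textures",
});
```

Omitted slots (here, style) are skipped, so the helper never emits empty sentences.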

Example:

"A cybernetic owl perched on a neon sign in a rain-soaked alley. Cinematic lighting with teal and orange highlights. Shot on 35mm film, shallow depth of field, hyper-detailed textures."

DO: Effective Prompting

| Technique | Example |
|-----------|---------|
| Be specific | "middle-aged man with salt-and-pepper hair wearing charcoal turtleneck" NOT "a man" |
| Describe the result | "person with clear eyes" NOT "remove glasses" |
| Use camera terms | "Shot on Hasselblad, 85mm lens, f/1.8" |
| Specify lighting | "golden hour rim lighting with deep shadows" |
| Include textures | "weathered sandstone", "anodized aluminum", "iridescent silk" |

DON'T: Common Mistakes

| Mistake | Problem | Fix |
|---------|---------|-----|
| Negative phrasing | "no glasses" often adds glasses | Describe what IS there |
| Vague subjects | AI interprets randomly | Be exhaustively specific |
| Keyword salad | "4k, trending, masterpiece" is noise | Use descriptive sentences |
| Short prompts | Prompts under 20 words underperform | Aim for 40-80 words |
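
Two of these mistakes (short prompts and negative phrasing) can be caught mechanically before spending a generation. This is a rough sketch under assumptions: `lintPrompt` is a hypothetical helper, and the negation word list is a heuristic that is not taken from the skill.

```typescript
type PromptIssue = "too-short" | "below-target" | "above-target" | "negative-phrasing";

// Flag prompts outside the 40-80 word target and prompts that use negation.
function lintPrompt(prompt: string): PromptIssue[] {
  const words = prompt.trim().split(/\s+/).filter((w) => w.length > 0);
  const issues: PromptIssue[] = [];

  if (words.length < 20) issues.push("too-short");
  else if (words.length < 40) issues.push("below-target");
  else if (words.length > 80) issues.push("above-target");

  // Heuristic only (an assumption): these words usually signal negative phrasing.
  if (/\b(no|not|without|remove)\b/i.test(prompt)) issues.push("negative-phrasing");

  return issues;
}
```

A prompt such as "no glasses" would be reported as both too short and negatively phrased; the fix is to describe what should be visible instead.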

Style Keywords That Work

| Category | Keywords |
|----------|----------|
| Lighting | golden hour, volumetric lighting, Rembrandt lighting, neon rim light, bioluminescent |
| Camera | 35mm anamorphic, macro photography, tilt-shift, fisheye, drone shot |
| Style | cinematic, photorealistic, concept art, ukiyo-e, baroque, impressionist |
| Quality | hyper-detailed, sharp focus, 8k resolution, raytraced |

Text & Logo Generation (--text flag)

Uses Recraft V3 (iterate/default) or Ideogram V3 (premium/max) - specialized for typography.

Text Prompting Rules

CRITICAL: Put text in "Double Quotes" at the START of your prompt.

# Correct - text first, then describe
bun scripts/gen.ts '"QUANTUM" in bold futuristic font, metallic silver, dark space background' --text

# Wrong - text buried in description
bun scripts/gen.ts 'A logo with the word QUANTUM on it' --text
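
The text-first rule is easy to check before calling `gen.ts` with `--text`. The sketch below is illustrative only; `leadingQuotedText` is a hypothetical helper and is not part of the skill's scripts.

```typescript
// Return the double-quoted text that opens the prompt, or null if the prompt
// does not start with quoted text.
function leadingQuotedText(prompt: string): string | null {
  const match = /^\s*"([^"]+)"/.exec(prompt);
  return match?.[1] ?? null;
}

const goodText = leadingQuotedText('"QUANTUM" in bold futuristic font, metallic silver, dark space background');
const badText = leadingQuotedText("A logo with the word QUANTUM on it");
// goodText is "QUANTUM"; badText is null, so the second prompt should be rewritten.
```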

Logo Design Patterns

| Style | Prompt Pattern |
|-------|----------------|
| Minimalist | "BRAND" minimalist vector logo, clean lines, simple geometry, flat design |
| Vintage | "EST. 1920" vintage badge logo, circular emblem, ribbon banner, ornate border |
| Negative space | "PEAK" logo where the letter A forms a mountain, negative space design |
| 3D/Modern | "TECHCORP" bold 3D chrome letters, gradient fill, dark background |

Font Specification

Use typography terms: modern sans-serif, elegant script, bold blocky, blackletter, neon tubing, retro 70s serif

DO/DON'T for Text

| DO | DON'T |
|----|-------|
| "Three cats playing" (exact count) | "cats playing" (random count) |
| "wooden baseball bat" (specific) | "bat" (ambiguous) |
| Describe only what you want | "no cake" (will add cake) |

Image Editing

bun scripts/edit.ts <image> <instruction> [-t TIER] [--mask <mask.png>] [--ref <img>...]

Writing Edit Instructions

Key: Describe the TARGET STATE, not the change.

| Bad Instruction | Good Instruction |
|-----------------|------------------|
| "change car to blue" | "A sleek blue metallic sports car, reflections of neon lights on wet asphalt" |
| "add a hat" | "person wearing a vintage red fedora, matching the scene lighting" |
| "remove background" | Use rembg.ts instead (FREE and better) |

Mask Best Practices

| Task | Mask Strategy |
|------|---------------|
| Object removal | Mask LARGER than object (10-20px margin) for seamless fill |
| Object addition | Mask exact shape or slightly smaller |
| Outpainting | Overlap 10-20px INTO original image |

Feathering: Apply 12-16px blur to masks. Sharp masks = visible seams.
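
For object removal, the margin rule amounts to growing the object's bounding box and clamping it to the image. The following sketch shows only that arithmetic; `Box` and `expandMaskBox` are hypothetical names, and producing or feathering the actual mask image is left to an image editor or library of your choice.

```typescript
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Grow a bounding box by `margin` pixels on every side, clamped to the image.
function expandMaskBox(box: Box, margin: number, imageWidth: number, imageHeight: number): Box {
  const left = Math.max(0, box.x - margin);
  const top = Math.max(0, box.y - margin);
  const right = Math.min(imageWidth, box.x + box.width + margin);
  const bottom = Math.min(imageHeight, box.y + box.height + margin);
  return { x: left, y: top, width: right - left, height: bottom - top };
}

// A 200x120 object at (100, 50) with a 15px margin in a 1024x1024 image.
const removalMask = expandMaskBox({ x: 100, y: 50, width: 200, height: 120 }, 15, 1024, 1024);
// removalMask is { x: 85, y: 35, width: 230, height: 150 }
```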

Multi-Reference Editing (--ref)

Using 2+ reference images auto-selects max tier (flux-2-flex).

# Style transfer: apply reference style to base image
bun scripts/edit.ts base.jpg "in the style of the reference" --ref style.jpg

# Multi-reference blending
bun scripts/edit.ts scene.jpg "forest sofa scene" --ref forest.jpg --ref sofa.jpg

Tip: When blending references, describe their relationship: "A velvet sofa placed in a misty pine forest"
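
When calling `edit.ts` from another script, the argument list can be assembled programmatically. This is a sketch that uses only the flags shown in the synopsis above (`-t`, `--mask`, `--ref`); `EditOptions` and `buildEditArgs` are hypothetical names, and the tier values are assumed to match the tier names used in this document.

```typescript
interface EditOptions {
  tier?: "iterate" | "default" | "premium" | "max"; // assumed valid -t values
  mask?: string;
  refs?: string[];
}

// Build the argument list for `bun scripts/edit.ts` without shell quoting issues.
function buildEditArgs(image: string, instruction: string, opts: EditOptions = {}): string[] {
  const args = ["scripts/edit.ts", image, instruction];
  if (opts.tier) args.push("-t", opts.tier);
  if (opts.mask) args.push("--mask", opts.mask);
  for (const ref of opts.refs ?? []) args.push("--ref", ref);
  return args;
}

const blendArgs = buildEditArgs("scene.jpg", "A velvet sofa placed in a misty pine forest", {
  refs: ["forest.jpg", "sofa.jpg"],
});
```

Passing the array to a process launcher (for example `Bun.spawn(["bun", ...blendArgs])`) keeps the instruction as one argument, so quotes inside the prompt need no escaping.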

Upscaling

bun scripts/upscale.ts <image> [-t TIER] [--scale 2|4]

When to Use 2x vs 4x

| Source Quality | Recommendation |
|----------------|----------------|
| High (RAW, clean PNG) | 4x safe - AI infers detail accurately |
| Medium (standard JPEG) | 2x preferred - denoise first if possible |
| Low (compressed, blurry) | 2x max - noise gets magnified |

Use Case Guidelines

| Output | Scale | Notes |
|--------|-------|-------|
| Web/UI | 2x | Keeps file size manageable while improving perceived sharpness |
| Print (300 DPI) | 4x | Target 300 DPI for print quality |
| Icons/Logos | 2x | Prefer svg.ts when infinite scaling is needed |
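
For print, the required scale follows from simple arithmetic: pixels needed = print size in inches × DPI. The sketch below works through that calculation; `requiredScale` and `pickScale` are hypothetical helpers, and the only scale values assumed are the `2` and `4` accepted by `--scale`.

```typescript
// Factor by which the source must grow to reach the target DPI at the print size.
function requiredScale(sourcePx: number, printInches: number, dpi = 300): number {
  return (printInches * dpi) / sourcePx;
}

// Map the factor to a supported --scale value: 1 means no upscale is needed,
// null means even 4x would fall short of the target.
function pickScale(factor: number): 1 | 2 | 4 | null {
  if (factor <= 1) return 1;
  if (factor <= 2) return 2;
  if (factor <= 4) return 4;
  return null;
}

// A 1024px-wide image printed 8 inches wide needs 2400px, a factor of about 2.34.
const printScale = pickScale(requiredScale(1024, 8));
// printScale is 4
```

Source quality still applies: a factor that calls for 4x on a noisy JPEG is a signal to find a better source rather than to upscale harder.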

Common Artifacts & Fixes

| Artifact | Cause | Prevention |
|----------|-------|------------|
| Haloing (white edges) | Aggressive sharpening | Use iterate/default tier |
| Plasticky skin | Over-smoothing | Reduce to 2x, use premium tier |
| Grid patterns | Tile processing | Use higher tier models |

Rule of Thumb: If image looks "crunchy" at 100% zoom, don't exceed 2x.

Tier Selection Guide

| Scenario | Tier | Why |
|----------|------|-----|
| Exploring 10+ variations | iterate | FREE, fast iteration |
| Daily work, 3-5 variations | default | Best cost/quality balance |
| Client deliverables | premium | Higher fidelity |
| Critical assets, multi-ref | max | SOTA quality, advanced features |
| Text/logos (any) | default | Recraft V3 already excellent |
| Text/logos (critical) | premium | Ideogram V3 for perfect typography |

Cost Optimization

EXPENSIVE WORKFLOW (avoid):
  Generate at max tier → iterate on max → deliver

COST-EFFECTIVE WORKFLOW (recommended):
  Generate at iterate (FREE) → find best concept
  → Regenerate winner at default/premium → deliver
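
The difference between the two workflows can be estimated from the per-megapixel rates in the Quick Start. This is a back-of-the-envelope sketch with stated assumptions: one megapixel is taken as 1,000,000 pixels, the max tier is priced at the upper bound of $0.07/MP, and ten exploration images is an illustrative count. Actual billing may differ.

```typescript
// Rates in USD per megapixel, taken from the tier list; max uses the upper bound.
const RATE_PER_MP = { iterate: 0, default: 0.008, premium: 0.03, max: 0.07 } as const;
type Tier = keyof typeof RATE_PER_MP;

// Estimated cost of one image, assuming 1 MP = 1,000,000 pixels.
function imageCost(tier: Tier, width: number, height: number): number {
  return RATE_PER_MP[tier] * ((width * height) / 1_000_000);
}

// Estimated cost of a workflow made of several generation steps.
function workflowCost(steps: { tier: Tier; images: number }[], width = 1024, height = 1024): number {
  return steps.reduce((sum, s) => sum + s.images * imageCost(s.tier, width, height), 0);
}

// Ten explorations at max vs. ten free drafts plus one premium regeneration.
const expensive = workflowCost([{ tier: "max", images: 10 }]);
const costEffective = workflowCost([
  { tier: "iterate", images: 10 },
  { tier: "premium", images: 1 },
]);
```

Under these assumptions the first workflow comes to roughly $0.73 and the second to roughly $0.03 at 1024x1024.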

Environment

# For FREE iterate generation (Cloudflare)
CLOUDFLARE_ACCOUNT_ID=xxx
CLOUDFLARE_API_TOKEN=xxx

# For paid tiers (Fal.ai)
FAL_API_KEY=xxx

Quota: Cloudflare FREE tier allows ~96 images/day at 1024x1024.
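
A pre-flight check can report missing variables before a script fails with exit code 2. The sketch below assumes that non-text generation on the iterate tier needs only the Cloudflare variables and that every other tier needs only `FAL_API_KEY`; `missingEnv` is a hypothetical helper, and `--text` runs on iterate may also require the Fal.ai key.

```typescript
type Env = Record<string, string | undefined>;
type TierName = "iterate" | "default" | "premium" | "max";

// List the variables that are unset or empty for the given tier.
function missingEnv(tier: TierName, env: Env): string[] {
  const required =
    tier === "iterate"
      ? ["CLOUDFLARE_ACCOUNT_ID", "CLOUDFLARE_API_TOKEN"] // free tier via Cloudflare
      : ["FAL_API_KEY"]; // paid tiers via Fal.ai
  return required.filter((name) => !env[name]);
}

// Example with an explicit environment object; a script would pass process.env.
const missingForIterate = missingEnv("iterate", { CLOUDFLARE_ACCOUNT_ID: "xxx" });
// missingForIterate is ["CLOUDFLARE_API_TOKEN"]
```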

Exit Codes

| Code | Meaning | Action |
|------|---------|--------|
| 0 | Success | Image saved to .ada/data/images/ |
| 1 | General error | Check error message |
| 2 | Config/auth error | Verify API keys in .env |
| 3 | Resource limit | Quota exceeded - wait 24h or use paid tier |

CRITICAL: Exit code 3 does NOT fall back to paid tier. This prevents accidental charges.
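
A caller that wraps these scripts can branch on the exit code directly. The following sketch only encodes the table above; `NextStep` and `nextStepForExit` are hypothetical names, and how the exit code is obtained depends on the process launcher in use.

```typescript
type NextStep =
  | "done"
  | "read-error-message"
  | "verify-api-keys"
  | "wait-or-switch-tier"
  | "unknown";

// Translate a script's exit code into the documented follow-up action.
function nextStepForExit(code: number): NextStep {
  switch (code) {
    case 0:
      return "done"; // image saved to .ada/data/images/
    case 1:
      return "read-error-message";
    case 2:
      return "verify-api-keys"; // check .env
    case 3:
      return "wait-or-switch-tier"; // quota exceeded; no automatic paid fallback
    default:
      return "unknown";
  }
}
```

Because exit code 3 never falls back on its own, a wrapper that wants to continue has to rerun the command with an explicit paid tier.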

Integration

| Skill | When to Use Together |
|-------|---------------------|
| ui-animation | Animate generated images for web/mobile |
| docs-write | Document image assets and parameters used |
| search | Find prompting resources and style references |
| code-quality | After modifying skill scripts |

References

references/usage-guide.md - Extended prompting guide, error codes, testing
README.md - Architecture diagrams, model reference, CLI details
Fal.ai Docs - Official API documentation

Output

Images saved to .ada/data/images/ with timestamped filenames:

20260118_gen_default_cyberpunk_city.jpg
20260118_svg_default_logo_vector.svg
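
The two sample names suggest a `date_operation_tier_description.ext` layout. The parser below is a sketch built on that inference alone; the pattern is an assumption drawn from the examples, not a documented format, and `parseOutputName` is a hypothetical helper.

```typescript
interface OutputName {
  date: string; // YYYYMMDD
  operation: string; // e.g. "gen", "svg"
  tier: string; // e.g. "default"
  description: string; // remaining underscore-separated words
  extension: string;
}

// Split a saved filename into its parts, or return null if it does not match
// the assumed date_operation_tier_description.ext layout.
function parseOutputName(filename: string): OutputName | null {
  const match = /^(\d{8})_([a-z]+)_([a-z]+)_(.+)\.([a-z0-9]+)$/.exec(filename);
  if (!match) return null;
  const [, date = "", operation = "", tier = "", description = "", extension = ""] = match;
  return { date, operation, tier, description, extension };
}

const parsedName = parseOutputName("20260118_gen_default_cyberpunk_city.jpg");
// parsedName.operation is "gen", parsedName.tier is "default",
// parsedName.description is "cyberpunk_city"
```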
