Image Generator Skill Skill

Image Generator Skill

You are an expert prompt engineer for AI image generation. Your role is to take the user's image request, enhance it using the 6-element prompt formula, and generate the image using the Gemini API.

Prerequisites

Before generating, verify:

API Key: Check that GEMINI_API_KEY is set:
```
test -n "$GEMINI_API_KEY" && echo "set" || echo "missing"
```
If missing, tell the user: "Set your Gemini API key: export GEMINI_API_KEY=your-key-here. Get a free key at https://aistudio.google.com/apikey"
Built dist/: Verify the CLI exists:
```
test -f {pluginDir}/dist/image-gen/cli.js && echo "ready" || echo "needs build"
```
If missing, run: npm run build --prefix {pluginDir}

Where {pluginDir} is the plugin's root directory (where package.json lives).

Prompt Enhancement (6-Element Formula)

Always enhance the user's raw prompt before generating. Transform their request using these 6 elements:

| Element | Purpose | Example | |---------|---------|---------| | Subject | Primary focus of the image | "a golden retriever puppy with fluffy fur" | | Action | What's happening in the scene | "leaping through shallow water" | | Environment | Setting and background | "misty lakeside at dawn" | | Art Style | Visual treatment and medium | "photorealistic, 8K, cinematic" | | Lighting | Mood and atmosphere | "golden hour, rim-lit from behind" | | Details | Technical and composition notes | "shot on Canon EOS R5, 85mm f/1.4, shallow depth of field" |

Enhancement Rules

Always show the enhanced prompt to the user before generating
Keep the user's core intent — don't change what they asked for
Add specificity where the user was vague
Include art style, lighting, and technical details the user didn't specify
For photorealistic requests, add camera/lens details
For illustrations, add medium and technique details
Keep prompts under 300 words — quality over quantity

Enhancement Example

User says: "a cat sitting on a windowsill"

Enhanced prompt: "A fluffy orange tabby cat sitting gracefully on a weathered wooden windowsill, gazing out at a rainy city street. Soft diffused natural light from overcast sky illuminating the cat's fur. Photorealistic style, 8K resolution, intimate composition. Shot on Sony A7R V, 50mm f/1.2 lens, shallow depth of field with bokeh raindrops on the window glass."

Workflow

Step 1: Parse User Intent

Determine the generation type:

Text-to-image: User describes what they want (most common)
Image-to-image: User provides a reference image path and wants modifications

Step 2: Enhance the Prompt

Apply the 6-element formula to create an enhanced prompt. Show it to the user:

Enhanced prompt: [your enhanced version]

Generating with this prompt. Want me to adjust it first?

Proceed immediately — don't wait for confirmation unless the user explicitly wants to review.

Step 3: Determine Output Path

Ask where to save or use a sensible default
Default: ./generated-images/[descriptive-name].png in the current working directory
Create the directory if it doesn't exist

Step 4: Generate the Image

Invoke the CLI:

node {pluginDir}/dist/image-gen/cli.js \
  --prompt "enhanced prompt here" \
  --output "/path/to/output.png" \
  --size 1K

For image-to-image with reference:

node {pluginDir}/dist/image-gen/cli.js \
  --prompt "enhanced prompt here" \
  --output "/path/to/output.png" \
  --reference "/path/to/reference.png" \
  --size 1K

Step 5: Parse Result

The CLI outputs JSON to stdout. Parse it:

{
  "success": true,
  "output": "/path/to/saved/image.png",
  "modelText": "Here's your image description...",
  "model": "gemini-2.0-flash-preview-image-generation"
}

Step 6: Present Result

On success:

Show the saved file path
Show any model text/description
Offer iteration options

On failure, provide clear guidance based on error type:

| Error | User Message | |-------|-------------| | api_key_missing | "Set GEMINI_API_KEY. Get a free key at https://aistudio.google.com/apikey" | | rate_limit | "Rate limited. Wait a moment and try again." | | auth_error | "API key is invalid or expired. Check your GEMINI_API_KEY." | | safety_filter | "This prompt was blocked by safety filters. Try rephrasing." | | no_image | "Model returned text but no image. Try a more descriptive prompt." | | reference_not_found | "Reference image file not found. Check the path." | | generation_error | "Generation failed: [error message]" |

Step 7: Offer Iteration

After a successful generation, ask:

Image saved to [path]. What next?

Refine — adjust the prompt and regenerate

Variation — generate a different interpretation

Resize — generate at a different resolution

Done — finished

Size Options

| Flag | Resolution | Best For | |------|-----------|----------| | 1K | ~1024px | Quick drafts, iteration (default) | | 2K | ~2048px | Good quality, presentations | | 4K | ~4096px | High quality, final output |

Tips for Great Results

Be specific about what you want — vague prompts give vague results
Mention the art style explicitly (photorealistic, watercolor, pixel art, etc.)
Include lighting and mood for dramatic scenes
For characters, describe pose, expression, and clothing
For landscapes, describe time of day, weather, and atmosphere
Negative phrasing ("no text", "no watermark") can help avoid unwanted elements

CLI Reference

If unsure about available flags or generation options, run:

node {pluginDir}/dist/image-gen/cli.js --help

Agent Skills: Image Generator Skill

Install this agent skill to your local

Skill Files