Agent Skills: World Labs Single Image Input

Single image input for world generation - requirements, best practices, and examples

UncategorizedID: cloudai-x/world-labs-skills/world-labs-image-prompt

Install this agent skill to your local

pnpm dlx add-skill https://github.com/CloudAI-X/world-labs-skills/tree/HEAD/world-labs-image-prompt

Skill Files

Browse the full folder contents for world-labs-image-prompt.

Download Skill

Loading file tree…

world-labs-image-prompt/SKILL.md

Skill Metadata

Name
world-labs-image-prompt
Description
Single image input for world generation - requirements, best practices, and examples

World Labs Single Image Input

Generate 3D worlds from a single reference image. The model extrapolates the scene to create an immersive 360° environment.

Quick Reference

| Requirement | Specification | | ---------------------- | ---------------------------------- | | Recommended resolution | 1024px on long side | | Maximum file size | 20 MB | | Formats | PNG (recommended), JPG, WebP | | Aspect ratio | 16:9, 9:16, or anything in between |

Credits

| Input Type | Marble 0.1-plus | Marble 0.1-mini | | -------------- | --------------- | --------------- | | Standard image | 1,580 | 230 | | Panorama image | 1,500 | 150 |

Panoramas skip the pano generation step, saving 80 credits.

Best Practices

DO

Clear spatial definition: Images with obvious depth and perspective ✅ Wide shots: Show foreground, midground, and background ✅ Visible ground/floor: Helps establish world orientation ✅ Consistent lighting: Clear, well-lit scenes ✅ Environmental scenes: Landscapes, interiors, architectural spaces

DON'T

Close-up shots: Lack spatial context for 3D reconstruction ❌ People or animals: As main subjects (may cause artifacts) ❌ Abstract images: Non-representational art ❌ Borders or frames: Decorative edges, watermarks ❌ Heavy text overlays: Text doesn't render clearly ❌ Flat graphics: 2D illustrations without depth ❌ Blurry or low-contrast images: Reduces detail inference

Ideal Image Types

Excellent Results

  1. Landscape photography - Wide vistas with clear horizon and natural depth
  2. Architectural interiors - Rooms with visible floor/walls/ceiling
  3. Street scenes - Urban environments with buildings receding into distance
  4. Natural environments - Forests, caves, beaches with organic depth

Challenging (Use with Caution)

  • Indoor scenes with complex reflections
  • Very dark or overexposed images
  • Images with motion blur
  • Heavily edited/filtered photos

API Usage

Option 1: From Uploaded Media Asset

First upload your image (see world-labs-api skill), then:

{
  "model": "Marble 0.1-plus",
  "world_prompt": {
    "type": "image",
    "image_prompt": {
      "source": "media_asset",
      "media_asset_id": "550e8400-e29b-41d4-a716-446655440000"
    },
    "text_prompt": "Optional description to guide interpretation"
  }
}

Option 2: From Public URL

{
  "model": "Marble 0.1-plus",
  "world_prompt": {
    "type": "image",
    "image_prompt": {
      "source": "uri",
      "uri": "https://example.com/my-image.jpg"
    },
    "text_prompt": "A beautiful mountain landscape"
  }
}

Panorama Images (use is_pano flag)

For 360° equirectangular panoramas (2:1 aspect ratio):

{
  "model": "Marble 0.1-plus",
  "world_prompt": {
    "type": "image",
    "image_prompt": {
      "source": "media_asset",
      "media_asset_id": "550e8400-e29b-41d4-a716-446655440000",
      "is_pano": true
    }
  }
}

Upload Workflow

Step 1: Prepare Upload

curl -X POST "https://api.worldlabs.ai/marble/v1/media-assets:prepare_upload" \
  -H "WLT-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "file_name": "landscape.jpg",
    "kind": "image",
    "extension": "jpg"
  }'

Response:

{
  "media_asset": {
    "media_asset_id": "550e8400-e29b-41d4-a716-446655440000"
  },
  "upload_info": {
    "upload_url": "https://storage.googleapis.com/...",
    "upload_method": "PUT",
    "required_headers": {
      "x-goog-content-length-range": "0,1048576000"
    }
  }
}

Step 2: Upload Image

curl -X PUT "UPLOAD_URL" \
  -H "Content-Type: image/jpeg" \
  -H "x-goog-content-length-range: 0,1048576000" \
  --data-binary @landscape.jpg

Step 3: Generate World

curl -X POST "https://api.worldlabs.ai/marble/v1/worlds:generate" \
  -H "WLT-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Marble 0.1-plus",
    "world_prompt": {
      "type": "image",
      "image_prompt": {
        "source": "media_asset",
        "media_asset_id": "550e8400-e29b-41d4-a716-446655440000"
      },
      "text_prompt": "A dramatic mountain landscape at sunset"
    }
  }'

Python Example

import requests

def upload_image_and_generate(image_path: str, api_key: str, prompt: str = None, is_pano: bool = False):
    base_url = "https://api.worldlabs.ai/marble/v1"
    headers = {"WLT-Api-Key": api_key, "Content-Type": "application/json"}

    # Get extension
    ext = image_path.lower().split('.')[-1]
    if ext == "jpeg":
        ext = "jpg"

    # Step 1: Prepare upload
    prep_response = requests.post(
        f"{base_url}/media-assets:prepare_upload",
        headers=headers,
        json={"file_name": image_path.split('/')[-1], "kind": "image", "extension": ext}
    )
    prep_data = prep_response.json()
    media_asset_id = prep_data["media_asset"]["media_asset_id"]
    upload_url = prep_data["upload_info"]["upload_url"]

    # Step 2: Upload image
    content_types = {"jpg": "image/jpeg", "png": "image/png", "webp": "image/webp"}
    with open(image_path, 'rb') as f:
        requests.put(
            upload_url,
            headers={"Content-Type": content_types.get(ext, "image/jpeg")},
            data=f.read()
        )

    # Step 3: Generate world
    image_prompt = {"source": "media_asset", "media_asset_id": media_asset_id}
    if is_pano:
        image_prompt["is_pano"] = True

    world_prompt = {"type": "image", "image_prompt": image_prompt}
    if prompt:
        world_prompt["text_prompt"] = prompt

    gen_response = requests.post(
        f"{base_url}/worlds:generate",
        headers=headers,
        json={"model": "Marble 0.1-plus", "world_prompt": world_prompt}
    )

    return gen_response.json()["operation_id"]

# Usage
operation_id = upload_image_and_generate(
    "mountain_vista.jpg",
    "your_api_key",
    "Add dramatic storm clouds and golden sunset light"
)

Combining Text with Images

Text prompts guide interpretation and can transform the scene:

| Text Prompt | Effect | | ------------------------- | ------------------------------------------- | | None (omitted) | Auto-caption generated, faithful recreation | | "At sunset" | Changes lighting/atmosphere | | "In winter with snow" | Adds seasonal elements | | "Abandoned and overgrown" | Adds decay/nature reclaim | | "Futuristic version" | Sci-fi transformation |

When text is omitted, the model auto-generates a caption from the image.

Troubleshooting

| Issue | Solution | | ------------------ | ----------------------------------------------------------- | | Distorted geometry | Use image with clearer depth cues | | Missing areas | Provide image with more context/edges | | Wrong scale | Include recognizable objects for reference | | Artifacts on faces | Avoid images with people as main subject | | Billboard warping | Objects far from center may warp; center important elements |

Related Skills

  • world-labs-api - API integration details
  • world-labs-text-prompt - Text prompting best practices
  • world-labs-multi-image - Using multiple images with direction control
  • world-labs-pano-video - Panorama and video input
World Labs Single Image Input Skill | Agent Skills