Agent Skills: Gemini Batch API Skill

This skill should be used when the user asks to "use Gemini Batch API", "process documents at scale", "submit a batch job", "upload files to Gemini", or needs large-scale LLM processing. Includes production gotchas and best practices.

backendID: edwinhu/workflows/gemini-batch

Install this agent skill locally:

pnpm dlx add-skill https://github.com/edwinhu/workflows/tree/HEAD/skills/gemini-batch

skills/gemini-batch/SKILL.md


Gemini Batch API Skill

Large-scale asynchronous document processing using Google's Gemini models.

When to Use

  • Process thousands of documents with the same prompt
  • Cost-effective bulk extraction (50% cheaper than synchronous API)
  • Jobs that can tolerate 24-hour completion windows

IRON LAW: Use Examples First, Never Guess API

READ EXAMPLES BEFORE WRITING ANY CODE. NO EXCEPTIONS.

The Rule

User asks for batch API work
    ↓
MANDATORY: Read examples/batch_processor.py or examples/icon_batch_vision.py
    ↓
Copy the pattern exactly
    ↓
DO NOT guess parameter names
DO NOT try wrapper types
DO NOT improvise API calls

Why This Matters

The Batch API has non-obvious requirements that fail with cryptic errors:

  1. Metadata must be flat primitives - Nested objects cause cryptic errors
  2. Parameter is dest= not destination= - Wrong name → TypeError
  3. Config is plain dict - Not a wrapper type
  4. Examples are authoritative - Working code beats assumptions

Rationale: Previous agents wasted hours debugging API errors that the examples would have prevented. The patterns in examples/ are battle-tested production code.

Rationalization Table - STOP If You Catch Yourself Thinking:

| Excuse | Reality | Do Instead |
|--------|---------|------------|
| "I know how APIs work" | You're overconfident about non-obvious gotchas | Read examples first |
| "I can figure it out" | You'll waste 30+ minutes on trial-and-error | Copy working patterns |
| "The examples might be outdated" | They're maintained and tested | Trust the examples |
| "I need to customize anyway" | Your customization comes AFTER copying the base pattern | Start with examples, then adapt |
| "Reading examples takes too long" | You'll save 30 minutes of debugging with 2 minutes of reading | Read examples first |
| "My approach is simpler" | Your simpler approach already failed | Use proven patterns |

Red Flags - STOP If You Catch Yourself Thinking:

  • "Let me try destination= instead of dest=" → You're about to cause a TypeError. Read examples.
  • "I'll create a CreateBatchJobConfig object" → You're instantiating a type instead of using a plain dict. Stop.
  • "I'll nest metadata like a normal API" → You'll trigger BigQuery type errors. Flatten your data.
  • "This should work like other Google APIs" → Your assumption is wrong; this API is different.
  • "I'll figure out the JSONL format" → You'll waste time. Copy from examples instead.

MANDATORY Checklist Before ANY Batch API Code

  • [ ] Read examples/batch_processor.py OR examples/icon_batch_vision.py
  • [ ] Identify which example matches the use case (Standard API vs Vertex AI)
  • [ ] Copy the example's API call pattern exactly
  • [ ] Copy the example's JSONL structure exactly
  • [ ] Copy the example's metadata structure exactly
  • [ ] Adapt for specific needs only after copying base pattern

Enforcement: Writing batch API code without reading examples first violates this IRON LAW and will result in preventable errors.

Prerequisites

Install gcloud SDK

# macOS: Install Google Cloud SDK via Homebrew
brew install google-cloud-sdk

# Linux: Install Google Cloud SDK from official sources
curl https://sdk.cloud.google.com | bash

Authentication Setup

# Authenticate with Google Cloud Platform
gcloud auth login

# Set up Application Default Credentials for Python libraries
gcloud auth application-default login

# Enable Vertex AI API in your project
gcloud services enable aiplatform.googleapis.com

Why both auth methods?

  • gcloud auth login: For gsutil and gcloud CLI commands
  • gcloud auth application-default login: For google-generativeai Python library
  • CRITICAL: Vertex AI requires ADC (step 2), not just API key

Create GCS Bucket

# Create bucket in us-central1 (required region)
gsutil mb -l us-central1 gs://your-batch-bucket

# Verify bucket location is us-central1
gsutil ls -L -b gs://your-batch-bucket | grep "Location"

See references/gcs-setup.md for complete setup guide.

Quick Start

Standard Gemini API (API Key)

Uses the Gemini File API for input. Results are returned via job.dest.file_name.

from google import genai

client = genai.Client()  # Uses GOOGLE_API_KEY env var

# Upload JSONL to File API
uploaded = client.files.upload(
    file="requests.jsonl",
    config={"mime_type": "application/jsonl"}
)

# Submit batch job
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src=uploaded.name,  # "files/..." URI
    config={"display_name": "my-batch-job"}
)

# Results available at job.dest.file_name after completion

Vertex AI (Recommended for GCS workflows)

Uses GCS URIs directly. Supports dest= parameter for output location.

from google import genai

# Use Vertex AI with ADC (not API key)
client = genai.Client(
    vertexai=True,
    project="your-project-id",
    location="us-central1"
)

# Submit batch job with GCS paths
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src="gs://bucket/requests.jsonl",   # GCS input
    dest="gs://bucket/outputs/"          # GCS output (Vertex AI only!)
)

Key difference: Standard API uses File API (files/...), Vertex AI uses GCS (gs://...) with explicit dest= parameter.

Core Workflow

Standard API:

  1. Create JSONL request file with prompts
  2. Upload JSONL to File API via client.files.upload()
  3. Submit batch job via client.batches.create(src=uploaded.name)
  4. Poll for completion (jobs expire after 24 hours)
  5. Download results from job.dest.file_name
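Step 4 (polling) can be factored into a small helper. A minimal sketch: the JOB_STATE_* names and the real getter call (something like `client.batches.get(name=job.name)`) are assumptions to verify against examples/batch_processor.py; the helper takes a plain callable so it works for either API.

```python
import time

# Terminal state names assumed from the batch docs -- verify against examples/
TERMINAL_STATES = {"JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}

def poll_until_done(get_job, interval_s=30, timeout_s=24 * 3600):
    """Poll a batch job until it reaches a terminal state or times out.

    `get_job` is any zero-arg callable returning an object with a `.state`
    attribute -- in real code, something like:
        lambda: client.batches.get(name=job.name)
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        job = get_job()
        state = str(job.state)  # enum or string, depending on SDK version
        if any(state.endswith(t) for t in TERMINAL_STATES):
            return job
        time.sleep(interval_s)
    raise TimeoutError("batch job did not finish within the timeout")
```

The 24-hour default timeout mirrors the job expiration window; jobs still pending at that point will never complete.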

Vertex AI:

  1. Upload files to GCS bucket (us-central1 region required)
  2. Create JSONL request file with document URIs and prompts
  3. Submit batch job via client.batches.create(src=..., dest=...)
  4. Poll for completion (jobs expire after 24 hours)
  5. Download and parse results from GCS output URI
  6. Handle failures gracefully (partial failures are common)
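Step 2 in each workflow writes one JSON object per line. A minimal sketch of building a request line — the key/request/contents field names follow Google's public batch docs but should be confirmed against examples/ before use; documents are referenced by URI via fileData.fileUri rather than inlined (see Key Gotchas below):

```python
import json

def make_request_line(request_id, prompt, file_uri=None, mime_type="application/pdf"):
    """Build one JSONL request line.

    Field names (key/request/contents/parts/fileData) are taken from the
    public batch docs -- confirm against examples/ before relying on them.
    """
    parts = [{"text": prompt}]
    if file_uri:
        # Reference the document by URI (gs://... on Vertex AI, files/... on
        # the Standard API); batch jobs should not inline file bytes.
        parts.append({"fileData": {"fileUri": file_uri, "mimeType": mime_type}})
    return json.dumps({
        "key": request_id,  # echoed back in results for matching
        "request": {"contents": [{"role": "user", "parts": parts}]},
    })

line = make_request_line("req_0", "Summarize this document.", "gs://bucket/docs/a.pdf")
```

Each line is written to the JSONL file followed by a newline; scripts/validate_jsonl.py checks the file before submission.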

IRON LAW: Metadata and API Call Structure

YOU MUST USE FLAT PRIMITIVES FOR METADATA. YOU MUST USE SIMPLE STRINGS FOR API PARAMETERS.

Rule 1: Metadata Structure

CORRECT ✓
"metadata": {
    "request_id": "icon_123",        # String
    "file_name": "copy.svg",         # String
    "file_size": 1024                # Integer
}

WRONG ✗
"metadata": {
    "request_id": "icon_123",
    "file_info": {                   # ← NESTED OBJECT FAILS!
        "name": "copy.svg",
        "size": 1024
    }
}

WORKAROUND (if complex data needed)
"metadata": {
    "request_id": "icon_123",
    "file_info": json.dumps({"name": "copy.svg", "size": 1024})  # JSON string OK
}

Why: Vertex AI stores metadata in BigQuery-compatible format. BigQuery doesn't support nested types. Violation causes: "metadata" in the specified input data is of unsupported type.

Rule 2: API Call Structure

Standard API (File API):

CORRECT ✓
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src=uploaded_file.name,               # "files/..." URI from File API
    config={"display_name": "my-job"}     # Just a dict
)
# Results at: job.dest.file_name (after completion)

Vertex AI (GCS):

CORRECT ✓
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src="gs://bucket/input.jsonl",        # GCS URI
    dest="gs://bucket/output/",           # GCS output (VERTEX AI ONLY!)
    config={"display_name": "my-job"}
)

WRONG ✗
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src="gs://bucket/input.jsonl",
    destination="gs://bucket/output/",    # ← WRONG PARAM NAME! Use dest=
)

WRONG ✗
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src="gs://bucket/input.jsonl",
    config=types.CreateBatchJobConfig(    # ← DON'T INSTANTIATE TYPES!
        dest="gs://bucket/output/"
    )
)

Why:

  • Standard API: Uses File API for input, outputs to managed file location
  • Vertex AI: Uses GCS URIs, supports dest= for output location
  • Parameter is dest= (not destination). Config is a plain dict (not a type instance).

Rationalization Table - STOP If You Catch Yourself Thinking:

| Excuse | Reality | Do Instead |
|--------|---------|------------|
| "Nested metadata is cleaner" | Your code will fail with cryptic errors | Flatten or use json.dumps() |
| "I'll use dest= with Standard API" | Standard API doesn't support dest=; it's Vertex AI only | Use the File API pattern for Standard API |
| "I'll try destination= parameter" | You'll get a TypeError; the parameter doesn't exist | Use dest= (Vertex AI only) |
| "I should use CreateBatchJobConfig" | You're confusing internal typing with API calls | Pass a plain dict to config= |
| "Other APIs accept nested objects" | Your assumption breaks here; it's BigQuery-backed | Follow the examples |
| "I'll fix it if it breaks" | Your job fails 5 minutes after submission | Get it right the first time |

Pre-Submission Validation

# Add this check BEFORE submitting batch job
def validate_metadata(metadata: dict):
    """Ensure metadata contains only primitive types."""
    for key, value in metadata.items():
        if isinstance(value, (dict, list)):
            raise ValueError(
                f"Metadata '{key}' is {type(value).__name__}. "
                f"Only primitives (str, int, float, bool) allowed. "
                f"Use json.dumps() for complex data."
            )
        if not isinstance(value, (str, int, float, bool, type(None))):
            raise ValueError(f"Unsupported type for '{key}': {type(value)}")

# Validate all requests before submission:
for request in batch_requests:
    validate_metadata(request["metadata"])

Enforcement: Jobs will fail if metadata contains nested objects. There is no workaround for this requirement.

Key Gotchas

| Issue | Solution |
|-------|----------|
| **Nested metadata fails** | Use flat primitives or json.dumps() for complex data |
| **TypeError: unexpected keyword** | Use dest= not destination= (Vertex AI only) |
| **Mixing API patterns** | Standard API: File API + no dest. Vertex AI: GCS + dest |
| Auth errors with Vertex AI | Run gcloud auth application-default login |
| vertexai=True requires ADC | API key is ignored with vertexai=True |
| Missing aiplatform API | Run gcloud services enable aiplatform.googleapis.com |
| Region mismatch (Vertex) | Use us-central1 bucket only |
| Wrong URI format (Vertex) | Use gs:// not https:// |
| Invalid JSONL | Use scripts/validate_jsonl.py |
| Image batch: inline data | Use fileData.fileUri for batch, not inline |
| Duplicate IDs | Hash file content + prompt for unique IDs |
| Large PDFs fail | Split at 50 pages / 50MB max |
| JSON parsing fails | Use robust extraction (see gotchas.md) |
| Output not found (Vertex) | Output URI is a prefix, not a file path |

Top 3 mistakes (bolded above):

  1. Using nested objects in metadata instead of flat primitives
  2. Mixing Standard API and Vertex AI patterns
  3. Using destination= instead of dest= (Vertex AI)

See references/gotchas.md for detailed solutions (now with Gotchas 10 & 11).
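The duplicate-ID gotcha is avoided by deriving request IDs from content rather than counters, so re-running the pipeline produces identical IDs and retries deduplicate cleanly. A minimal sketch (the 16-character truncation is an illustrative choice):

```python
import hashlib

def request_id(file_bytes: bytes, prompt: str) -> str:
    """Derive a stable request ID from file content + prompt.

    The same (file, prompt) pair always yields the same ID, so resubmitted
    or retried requests can be detected and skipped.
    """
    h = hashlib.sha256()
    h.update(file_bytes)
    h.update(prompt.encode("utf-8"))
    return h.hexdigest()[:16]
```

Because the ID changes whenever either the file or the prompt changes, stale results from an earlier prompt version are never silently reused.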

Rate Limits

| Limit | Value |
|-------|-------|
| Max requests per JSONL | 10,000 |
| Max concurrent jobs | 10 |
| Max job size | 100MB |
| Job expiration | 24 hours |
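Workloads above 10,000 requests must be split across multiple JSONL files and submitted as separate jobs (at most 10 in flight at once). A minimal chunking sketch:

```python
def chunk_requests(lines, max_per_file=10_000):
    """Split JSONL request lines into chunks that respect the
    10,000-requests-per-file limit. Each chunk becomes its own JSONL
    file and batch job; the caller must also cap concurrent jobs at 10."""
    for i in range(0, len(lines), max_per_file):
        yield lines[i:i + max_per_file]
```

Note this only handles the request-count limit; very large documents can still trip the 100MB job-size cap, so check total file size per chunk as well.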

Recommended Models

| Model | Use Case | Cost |
|-------|----------|------|
| gemini-2.5-flash-lite | Most batch jobs | Lowest |
| gemini-2.5-flash | Complex extraction | Medium |
| gemini-2.5-pro | Highest accuracy | Highest |

Additional Resources

References

  • references/gcs-setup.md - NEW: Complete GCS and Vertex AI setup guide
  • references/gotchas.md - Critical production gotchas (updated auth section)
  • references/best-practices.md - Idempotent IDs, state tracking, validation
  • references/scale-up-testing.md - Incremental scale-up testing (LangExtract prototyping, LLM-as-judge, Vertex AI batch)
  • references/troubleshooting.md - Common errors and debugging
  • references/vertex-ai.md - Enterprise alternative with comparison
  • references/cli-reference.md - gsutil and gcloud commands

Examples

  • examples/icon_batch_vision.py - NEW: Batch vision analysis with Vertex AI
  • examples/batch_processor.py - Complete GeminiBatchProcessor class
  • examples/pipeline_template.py - Customizable pipeline template

Scripts

  • scripts/validate_jsonl.py - Validate JSONL before submission
  • scripts/test_single.py - Test single request before batch


Date Awareness

Pattern from oh-my-opencode: Gemini API and documentation evolve rapidly.

Use datetime.now() to establish the current date for:

  • API version checking
  • Model availability ("gemini-2.5-flash-lite available as of Dec 2024")
  • Documentation freshness validation

For API features or model names with uncertainty, verify against current date and check latest Gemini API documentation.
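The freshness check above can be sketched as a small staleness guard (the KNOWN_AS_OF date and six-month threshold are illustrative assumptions, not part of the skill):

```python
from datetime import datetime

# When this skill's model list was last verified -- illustrative value
KNOWN_AS_OF = datetime(2024, 12, 1)

def months_stale(now=None):
    """Whole months elapsed since the skill's knowledge date."""
    now = now or datetime.now()
    return (now.year - KNOWN_AS_OF.year) * 12 + (now.month - KNOWN_AS_OF.month)

if months_stale() > 6:
    print("Model names may be stale; check the latest Gemini API docs.")
```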