# Vertex AI Skill

Call Google Vertex AI models from the CLI using gcloud OAuth2 authentication.
## Prerequisites

- `gcloud` CLI installed (via `flox install google-cloud-sdk` or standalone)
- Authenticated: `gcloud auth login`
- Project set: `gcloud config set project PROJECT_ID`
- Vertex AI API enabled on the project
## Authentication

Vertex AI requires OAuth2 bearer tokens, not API keys. Get a token from gcloud:

```bash
ACCESS_TOKEN=$(gcloud auth print-access-token)
```

Tokens expire after ~60 minutes. Re-run the command to refresh.
## Core Pattern

All Vertex AI calls follow this structure (`MODEL`, `METHOD`, and `PAYLOAD` vary per task):

```bash
PROJECT=$(gcloud config get project)
REGION=us-central1
ACCESS_TOKEN=$(gcloud auth print-access-token)

curl -s "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT}/locations/${REGION}/publishers/google/models/${MODEL}:${METHOD}" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD"
```
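Building `PAYLOAD` with string interpolation breaks as soon as the prompt contains quotes or newlines. One way to construct it safely is with `jq -n` (a sketch; the prompt text is just an example):

```shell
# Safely build a generateContent payload from an arbitrary shell string.
# jq --arg handles all JSON escaping, so quotes and newlines are fine.
PROMPT='Summarize "Moby-Dick" in one sentence.'
PAYLOAD=$(jq -n --arg text "$PROMPT" \
  '{contents: [{role: "user", parts: [{text: $text}]}]}')
```

The resulting `$PAYLOAD` drops straight into the `-d` flag of the curl call above.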
## Gemini — Text Generation

```bash
MODEL=gemini-2.0-flash

curl -s "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT}/locations/${REGION}/publishers/google/models/${MODEL}:generateContent" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"role": "user", "parts": [{"text": "Your prompt here"}]}]
  }' | jq -r '.candidates[0].content.parts[0].text'
```

**CRITICAL:** Always include `"role": "user"` in `contents` — omitting it causes a 400 error.
### Available Gemini Models

| Model | Use Case |
|-------|----------|
| `gemini-2.0-flash` | Fast, general-purpose (recommended default) |
| `gemini-2.0-pro` | Complex reasoning, longer context |
| `gemini-1.5-pro` | 1M-token context window |
| `gemini-1.5-flash` | Fast, cost-effective |
### Generation Config

```json
{
  "contents": [{"role": "user", "parts": [{"text": "prompt"}]}],
  "generationConfig": {
    "temperature": 0.7,
    "topP": 0.95,
    "topK": 40,
    "maxOutputTokens": 2048,
    "candidateCount": 1
  }
}
```
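For reproducible output in scripted tests, lowering `temperature` to 0 makes sampling near-deterministic; a minimal sketch of such a config:

```json
{
  "contents": [{"role": "user", "parts": [{"text": "prompt"}]}],
  "generationConfig": {"temperature": 0, "maxOutputTokens": 256}
}
```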
### System Instructions

```json
{
  "contents": [{"role": "user", "parts": [{"text": "prompt"}]}],
  "systemInstruction": {
    "parts": [{"text": "You are a helpful coding assistant."}]
  }
}
```
## Gemini — Multimodal (Image + Text)

```bash
# Base64-encode an image (macOS syntax; on GNU/Linux use: base64 -w0 image.png)
IMG_B64=$(base64 -i image.png)

curl -s "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT}/locations/${REGION}/publishers/google/models/gemini-2.0-flash:generateContent" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"contents\": [{\"role\": \"user\", \"parts\": [
      {\"text\": \"Describe this image\"},
      {\"inlineData\": {\"mimeType\": \"image/png\", \"data\": \"$IMG_B64\"}}
    ]}]
  }" | jq -r '.candidates[0].content.parts[0].text'
```
## Embeddings

```bash
curl -s "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT}/locations/${REGION}/publishers/google/models/text-embedding-005:predict" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "instances": [{"content": "Text to embed"}]
  }' | jq '.predictions[0].embeddings.values[:5]'
```
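Embeddings are typically compared by cosine similarity, which can be computed entirely in `jq` (a sketch; the short vectors here are hypothetical stand-ins for real `.predictions[].embeddings.values` arrays):

```shell
# Cosine similarity of two embedding vectors, computed in jq.
A='[3,4]'   # stand-in for one embedding's .values
B='[4,3]'   # stand-in for another
SIM=$(jq -n --argjson a "$A" --argjson b "$B" '
  def dot(x; y): [x, y] | transpose | map(.[0] * .[1]) | add;
  dot($a; $b) / ((dot($a; $a) | sqrt) * (dot($b; $b) | sqrt))')
echo "$SIM"
```

Values close to 1 mean the two texts are semantically similar.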
## Imagen — Image Generation

```bash
curl -s "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT}/locations/${REGION}/publishers/google/models/imagen-3.0-generate-002:predict" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "instances": [{"prompt": "A cat wearing a space helmet, digital art"}],
    "parameters": {"sampleCount": 1}
  }' | jq -r '.predictions[0].bytesBase64Encoded' | base64 -d > output.png
```
## Streaming

For streaming responses, use the `streamGenerateContent` method:

```bash
curl -s "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT}/locations/${REGION}/publishers/google/models/gemini-2.0-flash:streamGenerateContent" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"contents": [{"role": "user", "parts": [{"text": "Write a haiku"}]}]}'
```
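In this form the chunks arrive as one JSON array of response objects, which `jq` can flatten back into plain text (a sketch; `RESPONSE` below is a hand-written mock of that shape, not real model output):

```shell
# Mock of the chunked array shape (real output comes from the curl above).
RESPONSE='[
  {"candidates": [{"content": {"parts": [{"text": "Soft rain "}]}}]},
  {"candidates": [{"content": {"parts": [{"text": "on the roof"}]}}]}
]'

# Join every chunk's text into one string (-j suppresses newlines).
TEXT=$(echo "$RESPONSE" | jq -rj '.[].candidates[0].content.parts[0].text')
```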
## Multi-turn Conversation

```json
{
  "contents": [
    {"role": "user", "parts": [{"text": "What is the capital of France?"}]},
    {"role": "model", "parts": [{"text": "Paris."}]},
    {"role": "user", "parts": [{"text": "What is its population?"}]}
  ]
}
```
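Maintaining the history in a shell variable and appending each turn with `jq` keeps the quoting safe across turns (a sketch; the model reply shown is hypothetical):

```shell
# Conversation history as a JSON array of turns.
HISTORY='[{"role": "user", "parts": [{"text": "What is the capital of France?"}]},
          {"role": "model", "parts": [{"text": "Paris."}]}]'

# Append the next user turn without hand-writing JSON.
HISTORY=$(jq --arg text "What is its population?" \
  '. + [{role: "user", parts: [{text: $text}]}]' <<<"$HISTORY")

# Wrap the array into the request body.
PAYLOAD=$(jq '{contents: .}' <<<"$HISTORY")
```

After each call, append the model's reply to `HISTORY` the same way with `role: "model"`.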
## Helper Script

For quick single-prompt calls:

```bash
vertex() {
  local model="${2:-gemini-2.0-flash}"
  local token project
  token=$(gcloud auth print-access-token)
  project=$(gcloud config get project 2>/dev/null)
  curl -s "https://us-central1-aiplatform.googleapis.com/v1/projects/${project}/locations/us-central1/publishers/google/models/${model}:generateContent" \
    -H "Authorization: Bearer $token" \
    -H "Content-Type: application/json" \
    -d "{\"contents\":[{\"role\":\"user\",\"parts\":[{\"text\":$(echo "$1" | jq -Rs .)}]}]}" \
    | jq -r '.candidates[0].content.parts[0].text'
}

# Usage: vertex "explain quantum computing" gemini-2.0-pro
```
## Error Reference

| Error | Cause | Fix |
|-------|-------|-----|
| `401 UNAUTHENTICATED` | Token expired or API key used | Refresh: `gcloud auth print-access-token` |
| `400` "valid role: user, model" | Missing `role` field in `contents` | Add `"role": "user"` |
| `403 PERMISSION_DENIED` | API not enabled or no access | `gcloud services enable aiplatform.googleapis.com` |
| `429 RESOURCE_EXHAUSTED` | Rate limit hit | Back off and retry |
| `404 NOT_FOUND` | Wrong model name or region | Check model ID and use `us-central1` |
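For `429 RESOURCE_EXHAUSTED`, a small exponential-backoff wrapper is enough for most scripts (a sketch; `retry` is a helper defined here, not a gcloud or curl feature):

```shell
# retry ATTEMPTS INITIAL_DELAY CMD...
# Run CMD until it succeeds, doubling the sleep between attempts.
retry() {
  local attempts=$1 delay=$2
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    if ((i < attempts)); then
      sleep "$delay"
      ((delay *= 2)) || true
    fi
  done
  return 1
}
```

Pair it with `curl -sf` so non-2xx responses count as failures, e.g. `retry 5 1 curl -sf "$URL" -H "Authorization: Bearer $ACCESS_TOKEN" -d "$PAYLOAD"`.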
## 1Password Integration

Store and retrieve credentials securely:

```bash
# If using a service account key stored in 1Password:
op read "op://VaultName/GCP-SA-Key/credential" | base64 -d > /tmp/sa.json
export GOOGLE_APPLICATION_CREDENTIALS=/tmp/sa.json
gcloud auth activate-service-account --key-file="$GOOGLE_APPLICATION_CREDENTIALS"
```