Cloudflare Cost Optimization Skill
Analyze Cloudflare architectures and predict monthly costs with actionable optimization recommendations. This skill provides engineering-grade cost estimates based on 2026 Cloudflare pricing.
Pricing Reference (2026)
Workers
- Requests: $0.30/million (after 10M free)
- CPU Time: $0.02/million GB-seconds
- Unbound: $12.50/million duration-ms (15M free)
- Subrequests: Count against request limits
D1 (SQLite)
- Reads: $0.25/billion rows
- Writes: $1.00/million rows (about 4,000× more expensive per row than reads at these rates!)
- Storage: $0.75/GB/month
- Cost Trap: `for(row){db.insert()}` = N× writes. Always batch ≤1,000 rows.
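The batching advice above can be sketched as a small chunking helper. This is a minimal illustration, assuming a D1 binding that exposes `prepare`, `bind`, and `batch`; the `events` table, its columns, and the `EventRow`, `D1Like`, and `StatementLike` types are hypothetical stand-ins so the snippet compiles on its own.

```typescript
// Structural stand-ins for the D1 binding (assumption: only prepare(),
// bind(), and batch() are needed for this sketch).
interface StatementLike {
  bind(...values: unknown[]): StatementLike;
}
interface D1Like {
  prepare(sql: string): StatementLike;
  batch(statements: StatementLike[]): Promise<unknown>;
}

// Hypothetical row shape, for illustration only.
interface EventRow {
  id: string;
  payload: string;
}

// Split an array into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be >= 1");
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Send rows in batches of up to `batchSize` statements instead of one call per row.
// Returns the number of batch calls made.
async function insertEvents(
  db: D1Like,
  rows: EventRow[],
  batchSize = 1000,
): Promise<number> {
  // Hypothetical table and columns.
  const insert = db.prepare("INSERT INTO events (id, payload) VALUES (?, ?)");
  let calls = 0;
  for (const group of chunk(rows, batchSize)) {
    await db.batch(group.map((row) => insert.bind(row.id, row.payload)));
    calls += 1;
  }
  return calls;
}
```

With 2,500 rows and the default batch size, this makes three `batch` calls rather than 2,500 individual inserts.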
R2 (Object Storage)
- Class A (writes): $4.50/million
- Class B (reads): $0.36/million
- Storage: $0.015/GB/month
- Egress: FREE (major advantage)
KV (Key-Value)
- Reads: $0.50/million
- Writes: $5.00/million (10x more than reads!)
- Storage: $0.50/GB/month
- Rate Limit: 1 write/sec/key
Queues
- Standard: $0.40/million messages
- Batch: $0.40/million batches
- Cost Trap: `max_retries: 3` = up to 3 extra deliveries per message (as much as 4× the base message cost)
Vectorize
- Queries: $0.01/million
- Stored Vectors: $0.05 per 100M stored dimensions (vectors × dimensions per vector)
- Hard Limit: 5M vectors/index, 1,536 dimensions
Workers AI
- Neurons: $0.011/1,000 (free allocation resets daily, UTC)
- Large Models (Llama 11B+): $0.68/M output tokens - expensive!
- Recommendation: Use smaller models (1B-8B) or Gemini Flash for bulk
AI Gateway
- Caching: Only caches IDENTICAL prompts (no semantic)
- Logs: 10M free, then $0.10/million
- Cost Trap: Forgetting to enable caching = paying full LLM cost on every request
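Because only identical prompts hit the cache, it can help to canonicalize prompts before they are sent. The sketch below is one possible approach, not a feature of AI Gateway itself: `normalizePrompt` and `dedupePrompts` are hypothetical helpers, and the sketch assumes that whitespace differences carry no meaning in your prompts.

```typescript
// Collapse runs of whitespace and trim the ends so prompts that differ only
// in spacing become the same string.
// Assumption: whitespace differences do not change the prompt's meaning.
function normalizePrompt(prompt: string): string {
  return prompt.trim().replace(/\s+/g, " ");
}

// Return the unique normalized prompts, preserving first-seen order.
function dedupePrompts(prompts: string[]): string[] {
  return Array.from(new Set(prompts.map(normalizePrompt)));
}
```

Case is left untouched on purpose, since lowercasing can change what a prompt means.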
Analytics Engine
- Essentially FREE - no per-write charges
- Note: Use `SUM(_sample_interval)` at scale (adaptive sampling)
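With adaptive sampling, each stored row stands for `_sample_interval` original events, so counting rows undercounts at scale. A minimal sketch of the same idea applied to rows already fetched from a query; the `SampledRow` type and `estimateEventCount` name are hypothetical.

```typescript
// One row as returned from an Analytics Engine query (only the field used here).
interface SampledRow {
  _sample_interval: number;
}

// Estimate the true number of events by summing the sample intervals,
// mirroring SUM(_sample_interval) in SQL.
function estimateEventCount(rows: SampledRow[]): number {
  return rows.reduce((total, row) => total + row._sample_interval, 0);
}
```

For four rows with intervals 1, 1, 10, and 100, a plain row count gives 4 while the estimate is 112 events.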
Analysis Workflow
Step 1: Gather Architecture Data
Use MCP tools to collect current usage:
1. Read wrangler.toml/wrangler.jsonc for bindings
2. Query cloudflare-observability for Worker metrics
3. Query cloudflare-ai-gateway for AI costs
4. Check cloudflare-bindings for resource lists
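The first step above can be sketched as a function that lists which billable services a Wrangler config binds. This is an illustration under assumptions: the config is already parsed into an object (for `wrangler.jsonc`, comments must be stripped before `JSON.parse`), only the top-level binding keys shown are checked, and `listBoundServices` is a hypothetical helper name.

```typescript
type WranglerConfig = Record<string, unknown>;

// Top-level Wrangler config keys mapped to the service they imply.
const BINDING_KEYS: Array<[string, string]> = [
  ["d1_databases", "D1"],
  ["r2_buckets", "R2"],
  ["kv_namespaces", "KV"],
  ["queues", "Queues"],
  ["vectorize", "Vectorize"],
  ["ai", "Workers AI"],
  ["analytics_engine_datasets", "Analytics Engine"],
];

// Return the services to include in the cost estimate. Workers is always
// present because the config itself describes a Worker.
function listBoundServices(config: WranglerConfig): string[] {
  const services = ["Workers"];
  for (const [key, service] of BINDING_KEYS) {
    const value = config[key];
    // Arrays count only when non-empty; objects (e.g. `ai`, `queues`) count when set.
    const present = Array.isArray(value)
      ? value.length > 0
      : value !== undefined && value !== null;
    if (present) services.push(service);
  }
  return services;
}
```

Environment-specific sections (`env.*`) are not inspected here; a fuller version would merge them first.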
Step 2: Calculate Per-Service Costs
For each service bound in wrangler config:
Workers:
monthly_cost = max(0, requests - 10M) / 1M * $0.30
+ cpu_gb_seconds / 1M * $0.02
D1:
monthly_cost = reads / 1B * $0.25
+ writes / 1M * $1.00
+ storage_gb * $0.75
R2:
monthly_cost = class_a_ops / 1M * $4.50
+ class_b_ops / 1M * $0.36
+ storage_gb * $0.015
KV:
monthly_cost = reads / 1M * $0.50
+ writes / 1M * $5.00
+ storage_gb * $0.50
Queues:
monthly_cost = messages / 1M * $0.40 * (1 + avg_retries)
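The formulas above translate directly into code. The sketch below is a minimal version that uses the rates from the pricing reference in this document; the `Usage` shape and `estimateMonthlyCosts` name are hypothetical, and the Workers free tier is clamped at zero so low-traffic Workers never produce a negative cost.

```typescript
// Monthly usage per service; omit a service that is not bound.
interface Usage {
  workers?: { requests: number; cpuGbSeconds: number };
  d1?: { rowsRead: number; rowsWritten: number; storageGb: number };
  r2?: { classAOps: number; classBOps: number; storageGb: number };
  kv?: { reads: number; writes: number; storageGb: number };
  queues?: { messages: number; avgRetries: number };
}

const MILLION = 1e6;
const BILLION = 1e9;

// Apply the per-service formulas; returns dollars per month keyed by service.
function estimateMonthlyCosts(usage: Usage): Record<string, number> {
  const costs: Record<string, number> = {};
  if (usage.workers) {
    const billable = Math.max(0, usage.workers.requests - 10 * MILLION);
    costs["Workers"] =
      (billable / MILLION) * 0.3 + (usage.workers.cpuGbSeconds / MILLION) * 0.02;
  }
  if (usage.d1) {
    costs["D1"] =
      (usage.d1.rowsRead / BILLION) * 0.25 +
      (usage.d1.rowsWritten / MILLION) * 1.0 +
      usage.d1.storageGb * 0.75;
  }
  if (usage.r2) {
    costs["R2"] =
      (usage.r2.classAOps / MILLION) * 4.5 +
      (usage.r2.classBOps / MILLION) * 0.36 +
      usage.r2.storageGb * 0.015;
  }
  if (usage.kv) {
    costs["KV"] =
      (usage.kv.reads / MILLION) * 0.5 +
      (usage.kv.writes / MILLION) * 5.0 +
      usage.kv.storageGb * 0.5;
  }
  if (usage.queues) {
    // Each retry is counted as another message charge.
    costs["Queues"] =
      (usage.queues.messages / MILLION) * 0.4 * (1 + usage.queues.avgRetries);
  }
  return costs;
}
```

For example, 50M D1 row writes with no reads or storage comes to $50, matching the sample figure used in the output format.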
Step 3: Identify Cost Drivers
Flag any service that's >20% of total cost. Common patterns:
| Cost Driver | Typical Cause | Fix |
|-------------|--------------|-----|
| D1 writes dominating | Per-row inserts | Batch to ≤1,000 |
| Queue costs high | Retries enabled | Set max_retries: 1 if idempotent |
| AI Gateway expensive | No caching | Enable cache, deduplicate prompts |
| Workers AI | Large model | Switch to smaller model or external LLM |
| R2 Class A | Frequent writes | Buffer writes, use R2 presigned URLs |
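The 20% rule can be sketched as a filter over the per-service costs. This is a minimal illustration; `findCostDrivers` and the `CostDriver` shape are hypothetical names, and the threshold is a parameter so it can be tuned.

```typescript
interface CostDriver {
  service: string;
  cost: number;
  share: number; // fraction of total cost, 0..1
}

// Return services whose share of total cost exceeds `threshold`,
// largest share first.
function findCostDrivers(
  costs: Record<string, number>,
  threshold = 0.2,
): CostDriver[] {
  const entries = Object.entries(costs);
  const total = entries.reduce((sum, [, cost]) => sum + cost, 0);
  if (total <= 0) return []; // nothing billed, nothing to flag
  return entries
    .map(([service, cost]) => ({ service, cost, share: cost / total }))
    .filter((driver) => driver.share > threshold)
    .sort((a, b) => b.share - a.share);
}
```

With costs of $50 for D1, $5 for Workers, and $7.50 for R2, only D1 is flagged, at 80% of the total.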
Step 4: Generate Recommendations
For each optimization opportunity, provide:
- Current: What it costs now
- Optimized: What it could cost
- Savings: Monthly/annual savings
- Trade-off: What changes in behavior
- Implementation: Specific code/config change
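The fields above map naturally onto a small record with derived savings. A minimal sketch, with hypothetical type and function names; the trade-off and implementation strings in the usage note are illustrative only.

```typescript
interface Recommendation {
  opportunity: string;
  currentMonthly: number; // dollars per month today
  optimizedMonthly: number; // dollars per month after the change
  monthlySavings: number;
  annualSavings: number;
  tradeOff: string;
  implementation: string;
}

type RecommendationInput = Omit<Recommendation, "monthlySavings" | "annualSavings">;

// Derive monthly and annual savings from the current and optimized costs.
function buildRecommendation(input: RecommendationInput): Recommendation {
  const monthlySavings = input.currentMonthly - input.optimizedMonthly;
  return { ...input, monthlySavings, annualSavings: monthlySavings * 12 };
}
```

Calling it with `currentMonthly: 50` and `optimizedMonthly: 5` yields savings of $45/month and $540/year.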
Output Format
# Cloudflare Cost Analysis
## Monthly Cost Estimate: $X.XX
### Breakdown by Service
| Service | Cost | % of Total | Status |
|---------|------|------------|--------|
| D1 | $X.XX | X% | ⚠️ Cost driver |
| Workers | $X.XX | X% | ✅ Normal |
| R2 | $X.XX | X% | ✅ Normal |
### Cost Drivers Identified
1. **D1 Writes** (80% of total)
- Current: 50M writes/month = $50
- Pattern detected: Per-row inserts in cron job
- Fix: Batch inserts to ≤1,000 rows
### Optimization Opportunities
| Opportunity | Current | Optimized | Savings | Effort |
|-------------|---------|-----------|---------|--------|
| Batch D1 writes | $50/mo | $5/mo | $45/mo ($540/yr) | Low |
| Reduce queue retries | $10/mo | $3/mo | $7/mo ($84/yr) | Trivial |
### Warnings
- ⚠️ D1 writes >50M/day is a red flag
- ⚠️ Workers AI Llama 11B is expensive for high-volume
### Action Items
1. [ ] Change `for(row){insert()}` to `db.batch()` in `processor.ts`
2. [ ] Set `max_retries: 1` for `layer2-queue` in wrangler.jsonc
3. [ ] Consider switching AI model from llama-3-11b to llama-3-8b
MCP Tools to Use
- `mcp__cloudflare-observability__query_worker_observability` - Worker request/duration metrics
- `mcp__cloudflare-ai-gateway__list_logs` - AI request costs
- `mcp__cloudflare-bindings__workers_get_worker` - Worker details
- `mcp__cloudflare-bindings__d1_databases_list` - D1 databases
- `mcp__cloudflare-bindings__r2_buckets_list` - R2 buckets
- `mcp__cloudflare-bindings__kv_namespaces_list` - KV namespaces
Tips
- D1 is usually the culprit: At the rates above, a row write costs about 4,000× as much as a row read
- Queue retries multiply costs: Each retry = another message charge
- Analytics Engine is nearly free: Use it heavily for metrics
- R2 egress is free: Use R2 over S3 when possible
- AI caching only works for identical prompts: Deduplicate inputs
Example Usage
When the user asks:
- "How much is this costing me?"
- "Optimize my Cloudflare costs"
- "Why is my D1 bill so high?"
- "Estimate monthly costs for this architecture"
Invoke this skill to provide detailed cost analysis with actionable recommendations.