Agent Skills: Anthropic Production Checklist


Install this agent skill locally:

pnpm dlx add-skill https://github.com/jeremylongshore/claude-code-plugins-plus-skills/tree/HEAD/plugins/saas-packs/anthropic-pack/skills/anth-prod-checklist

plugins/saas-packs/anthropic-pack/skills/anth-prod-checklist/SKILL.md

Anthropic Production Checklist

Overview

A checklist for deploying Claude API integrations to production, covering reliability, observability, and cost controls.

Pre-Launch Checklist

Authentication & Keys

  • [ ] Production API key from dedicated Workspace
  • [ ] Key stored in secret manager (not env files on servers)
  • [ ] Key rotation procedure documented and tested
  • [ ] Separate keys for each environment (dev/staging/prod)
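
The key-handling items above can be sketched as a small loader that resolves one key per environment from a secret store. This is a minimal sketch, not a prescribed implementation: the secret names in `SECRET_NAMES` and the `fetch_secret` callable are hypothetical stand-ins for whatever secret manager client you use.

```python
from typing import Callable, Optional

# Hypothetical naming scheme: one secret per environment, never shared.
SECRET_NAMES = {
    "dev": "anthropic/api-key/dev",
    "staging": "anthropic/api-key/staging",
    "prod": "anthropic/api-key/prod",
}


def load_api_key(
    environment: str,
    fetch_secret: Callable[[str], Optional[str]],
) -> str:
    """Return the API key for one environment, failing loudly if it is missing.

    `fetch_secret` is an assumed adapter around your secret manager: it takes
    a secret name and returns the value, or None when the secret is absent.
    """
    if environment not in SECRET_NAMES:
        raise ValueError(f"unknown environment: {environment!r}")
    key = fetch_secret(SECRET_NAMES[environment])
    if not key:
        raise RuntimeError(f"no API key configured for {environment!r}")
    return key
```

Passing the lookup in as a callable keeps the loader independent of any one secret manager and makes a key rotation drill easy to rehearse with a fake store.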

Error Handling

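  • [ ] Documented API error types handled, including: invalid_request_error, authentication_error, permission_error, not_found_error, request_too_large, rate_limit_error, api_error, overloaded_error
  • [ ] SDK retry limit set (max_retries in Python, maxRetries in TypeScript; the SDK default is 2, recommended: 3-5 for production)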
  • [ ] Custom error logging with request-id captured
  • [ ] Circuit breaker for sustained API failures
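
The circuit breaker item above could look roughly like the following. It is a minimal sketch of one common design (consecutive-failure counting with a cool-down), not something the checklist prescribes; the class name, default threshold, and cool-down are illustrative assumptions.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Open after N consecutive failures; allow trial calls after a cool-down."""

    def __init__(
        self,
        failure_threshold: int = 5,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock  # injectable so the breaker can be tested without sleeping
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        """Return True when a call may be attempted."""
        if self._opened_at is None:
            return True  # closed: traffic flows normally
        # Open: permit a trial call only once the cool-down has elapsed.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        """Close the breaker and clear the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure and (re)open the breaker at the threshold."""
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

A caller would check `allow_request()` before each API call, then report the outcome with `record_success()` or `record_failure()`. Counting only retry-exhausted failures (for example sustained api_error or overloaded_error responses) keeps a single invalid request from tripping the breaker.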

Rate Limits & Cost

  • [ ] Usage tier verified at console.anthropic.com
  • [ ] Application-level rate limiting implemented
  • [ ] Cost alerts configured (monthly spend caps)
  • [ ] Model selection optimized (Haiku for simple tasks, Sonnet for complex)
  • [ ] max_tokens set to realistic values (not inflated)
  • [ ] Prompt caching enabled for repeated system prompts
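
For application-level rate limiting, one common option is a token bucket placed in front of the API client. The sketch below assumes a single process and is not thread-safe; the class name and parameters are illustrative, and the rate should be chosen to sit below the limits of your usage tier.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_sec` tokens per second."""

    def __init__(
        self,
        rate_per_sec: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start full so initial bursts succeed
        self._last = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Take `cost` tokens if available; return False instead of blocking."""
        now = self._clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

When `try_acquire()` returns False the caller can queue the request, shed it, or return a "try again later" response, which is usually cheaper than sending the request and receiving a 429.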

Reliability

  • [ ] Timeout configured (SDK timeout option; recommended 60-120s, since the SDK default is 10 minutes)
  • [ ] Graceful degradation when API is unavailable
  • [ ] Health check endpoint tests API connectivity
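```python
import anthropic

# The async client is needed because health_check is a coroutine.
# It reads ANTHROPIC_API_KEY from the environment by default.
client = anthropic.AsyncAnthropic()

async def health_check():
    try:
        # Use token counting as a cheap health probe (no generation cost)
        count = await client.messages.count_tokens(
            model="claude-3-5-haiku-20241022",  # replace with a model ID available to your account
            messages=[{"role": "user", "content": "ping"}],
        )
        return {"status": "healthy", "tokens": count.input_tokens}
    except Exception as e:
        # A health endpoint should report failure, not raise
        return {"status": "degraded", "error": str(e)}
```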
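
Graceful degradation, listed above, can be as simple as wrapping the model call and returning a static response when it fails. This is a minimal sketch: `call_model` is a hypothetical callable standing in for your own function that sends the prompt and returns text, and the fallback message is a placeholder.

```python
from typing import Callable, Dict, Union

FALLBACK_TEXT = "This feature is temporarily unavailable. Please try again shortly."


def answer_with_fallback(
    call_model: Callable[[str], str],
    prompt: str,
    fallback_text: str = FALLBACK_TEXT,
) -> Dict[str, Union[str, bool]]:
    """Return the model's answer, or a static fallback if the call fails."""
    try:
        return {"text": call_model(prompt), "degraded": False}
    except Exception:
        # Any failure (timeout, 5xx, open circuit breaker) degrades to the
        # fallback; the flag lets callers and dashboards count degraded responses.
        return {"text": fallback_text, "degraded": True}
```

Depending on the feature, a cached earlier answer or a simpler non-LLM code path may be a better fallback than a static message.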

Observability

  • [ ] Request/response logging (redact content, keep metadata)
  • [ ] Latency tracking (p50, p95, p99)
  • [ ] Token usage tracking (input + output per request)
  • [ ] Cost tracking per feature/customer
  • [ ] Error rate alerting (429s, 5xx, timeouts)
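```python
import logging
import time

import anthropic

logger = logging.getLogger("anthropic")
# Reads ANTHROPIC_API_KEY from the environment by default
client = anthropic.Anthropic()

def tracked_create(**kwargs):
    """Call messages.create and log metadata only (no prompt or completion text)."""
    start = time.monotonic()
    try:
        response = client.messages.create(**kwargs)
        duration = time.monotonic() - start
        logger.info(
            "claude_request",
            extra={
                # The Python SDK exposes the request-id response header here
                "request_id": response._request_id,
                "model": response.model,
                "input_tokens": response.usage.input_tokens,
                "output_tokens": response.usage.output_tokens,
                "duration_ms": int(duration * 1000),
                "stop_reason": response.stop_reason,
            },
        )
        return response
    except Exception as e:
        duration = time.monotonic() - start
        logger.error(
            "claude_error",
            extra={
                "error": str(e),
                # Set on API status errors; None for connection errors and timeouts
                "request_id": getattr(e, "request_id", None),
                "duration_ms": int(duration * 1000),
            },
        )
        raise
```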
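
Cost tracking per feature can be derived from the same token counts that the logging wrapper records. The sketch below is illustrative only: the `CostTracker` class is hypothetical, and the per-million-token prices are parameters you supply from the current price list, not real Anthropic pricing.

```python
from collections import defaultdict
from typing import Dict


class CostTracker:
    """Accumulate estimated spend per feature from input and output token counts."""

    def __init__(self, input_price_per_mtok: float, output_price_per_mtok: float) -> None:
        # Prices are per one million tokens, in whatever currency you bill in.
        self.input_price = input_price_per_mtok
        self.output_price = output_price_per_mtok
        self._totals: Dict[str, float] = defaultdict(float)

    def record(self, feature: str, input_tokens: int, output_tokens: int) -> float:
        """Add one request's estimated cost to `feature` and return that cost."""
        cost = (
            input_tokens * self.input_price + output_tokens * self.output_price
        ) / 1_000_000
        self._totals[feature] += cost
        return cost

    def total(self, feature: str) -> float:
        """Return accumulated cost for `feature` (0.0 if never recorded)."""
        return self._totals.get(feature, 0.0)
```

Calling `record(feature, response.usage.input_tokens, response.usage.output_tokens)` after each request gives a running estimate that can feed the daily-cost alert. The same pattern works with a customer ID as the key.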

Content Safety

  • [ ] System prompts reviewed for injection resistance
  • [ ] User input validated and length-limited
  • [ ] Output scanned for sensitive data leakage
  • [ ] Content moderation for user-facing responses
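
Two of the items above, input length limits and output scanning, can be sketched with the standard library. The character limit and the regular expressions here are illustrative assumptions, not a complete data-loss-prevention rule set; the key pattern assumes keys shaped like `sk-ant-...`.

```python
import re

MAX_INPUT_CHARS = 8000  # hypothetical limit; tune to your prompt budget

# Illustrative patterns only; real deployments need a broader, reviewed list.
SENSITIVE_PATTERNS = [
    re.compile(r"sk-ant-[A-Za-z0-9_\-]+"),    # strings shaped like API keys
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # email addresses
]


def validate_user_input(text: str, max_chars: int = MAX_INPUT_CHARS) -> str:
    """Strip whitespace and reject empty or over-long input before it reaches the API."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("input is empty")
    if len(cleaned) > max_chars:
        raise ValueError(f"input exceeds {max_chars} characters")
    return cleaned


def redact_output(text: str) -> str:
    """Replace anything matching a sensitive pattern with a placeholder."""
    for pattern in SENSITIVE_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Length limits bound cost and reduce the room available for injected instructions, but they do not replace a review of the system prompt itself.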

Infrastructure

  • [ ] Deployment uses canary/rolling strategy
  • [ ] Rollback procedure documented and tested
  • [ ] Runbook created (see anth-incident-runbook)
  • [ ] On-call escalation path defined

Alerting Thresholds

| Metric | Warning | Critical |
|--------|---------|----------|
| Error rate (5xx) | > 1% | > 5% |
| p99 latency | > 10s | > 30s |
| 429 rate | > 5/min | > 20/min |
| Daily cost | > 80% budget | > 100% budget |
| Auth failures (401/403) | > 0 | > 0 (immediate) |
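
The thresholds in the table can be encoded as data and checked by one small function. This is a sketch under assumptions: the metric keys and the `classify` helper are hypothetical names, rates are expressed as fractions, and the values are copied from the table above.

```python
from typing import Dict, Tuple

# (warning, critical) thresholds taken from the table; a value must exceed
# the threshold to trigger. Metric keys are hypothetical names.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "error_rate_5xx": (0.01, 0.05),             # fraction of requests
    "p99_latency_s": (10.0, 30.0),              # seconds
    "rate_429_per_min": (5, 20),                # 429 responses per minute
    "daily_cost_budget_fraction": (0.80, 1.00), # spend / daily budget
    "auth_failures": (0, 0),                    # any 401/403 is critical
}


def classify(metric: str, value: float) -> str:
    """Return 'ok', 'warning', or 'critical' for a metric reading."""
    warning, critical = THRESHOLDS[metric]
    if value > critical:
        return "critical"
    if value > warning:
        return "warning"
    return "ok"
```

Because the auth-failure warning and critical thresholds are both zero, a single 401 or 403 classifies as critical, matching the "immediate" note in the table.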

Resources

Next Steps

For version upgrades, see anth-upgrade-migration.