Rate Limiting APIs
Overview
Implement rate limiting using sliding window, token bucket, and fixed window counter algorithms with Redis-backed distributed state. Configure per-endpoint, per-user, and per-API-key limits with tiered quotas, burst allowances, and standard response headers that communicate limit status to API consumers.
Prerequisites
- Redis 6+ for distributed rate limit state (required for multi-instance deployments)
- Rate limiting library: `rate-limiter-flexible` (Node.js), `slowapi` (Python/FastAPI), or Bucket4j (Java)
- API key or user identification mechanism for per-consumer tracking
- Monitoring for rate limit hit rates and rejected request metrics
- Documentation system for publishing rate limit policies to API consumers
Instructions
- Analyze endpoint traffic patterns using Read and Grep on access logs or metrics to determine appropriate rate limits per endpoint category (read-heavy, write-heavy, resource-intensive).
- Select the rate limiting algorithm per endpoint: token bucket for bursty traffic allowance, sliding window log for precise per-second limits, or fixed window counter for simple quota enforcement.
- Implement rate limiting middleware that extracts the client identifier (API key from header, user ID from JWT, or IP address as fallback) and checks against the configured limit.
- Configure tiered rate limits per API consumer plan: Free (100 req/min), Pro (1000 req/min), Enterprise (10000 req/min) with per-endpoint overrides for expensive operations.
- Add burst allowance using token bucket: allow 2x the sustained rate for 10 seconds to handle legitimate traffic spikes without penalizing well-behaved clients.
- Set standard rate limit response headers on every response: `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset` (Unix timestamp), and `RateLimit-Policy` (draft IETF standard).
- Return 429 Too Many Requests with a `Retry-After` header (seconds until next allowed request) and a JSON body explaining the limit, current usage, and reset time.
- Implement rate limit bypass for internal service-to-service calls using shared secret or mutual TLS identification to prevent internal traffic from consuming consumer quotas.
- Write tests that verify rate limits engage at exact thresholds, headers reflect correct remaining counts, and limits reset at the configured window boundary.
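The token bucket and burst steps above can be sketched as follows. This is a minimal in-memory illustration, not the skill's actual middleware: the class name `TokenBucket` and the helper `burstCapacity` are hypothetical, and the capacity formula is one interpretation of "2x the sustained rate for 10 seconds" (the bucket holds just enough extra tokens to absorb the excess over that period). A multi-instance deployment would keep the bucket state in Redis rather than in process memory.

```javascript
// Token bucket sketch. In-memory only; a distributed setup would store
// `tokens` and `lastRefillMs` in Redis instead (assumption, not shown here).
class TokenBucket {
  /**
   * @param {number} ratePerSec sustained refill rate in tokens per second
   * @param {number} capacity   maximum number of tokens the bucket can hold
   * @param {number} nowMs      creation time in milliseconds
   */
  constructor(ratePerSec, capacity, nowMs = Date.now()) {
    this.ratePerSec = ratePerSec;
    this.capacity = capacity;
    this.tokens = capacity; // start full so a new client can burst immediately
    this.lastRefillMs = nowMs;
  }

  // Refill based on elapsed time, then try to spend one token.
  // Returns true if the request is allowed.
  tryConsume(nowMs = Date.now()) {
    if (nowMs > this.lastRefillMs) {
      const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
      this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.ratePerSec);
      this.lastRefillMs = nowMs;
    }
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Capacity that lets a client run at `multiplier` times the sustained rate for
// `burstSec` seconds before the bucket empties. While bursting, the bucket
// drains at (multiplier - 1) * ratePerSec net of refill.
function burstCapacity(ratePerSec, multiplier, burstSec) {
  return Math.ceil((multiplier - 1) * ratePerSec * burstSec);
}
```

Under this interpretation, a Pro client at 1000 req/min (about 16.67 req/s) with a 2x burst for 10 seconds would get `burstCapacity(1000 / 60, 2, 10)`, which is 167 tokens.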
See ${CLAUDE_SKILL_DIR}/references/implementation.md for the full implementation guide.
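As a rough illustration of the identifier, header, and 429 steps listed in the instructions, the helpers below are framework-agnostic plain functions. The header name `x-api-key`, the `req.user.id` field, the `error` code string, and the JSON body field names are assumptions for the sketch, not a fixed contract. `RateLimit-Policy` is left out because its syntax has changed between revisions of the IETF draft.

```javascript
// Pick a client identifier: API key first, then authenticated user ID, then
// IP address as the fallback. Field names here are hypothetical.
function clientId(req) {
  const apiKey = req.headers["x-api-key"];
  if (apiKey) return `key:${apiKey}`;
  if (req.user && req.user.id) return `user:${req.user.id}`;
  return `ip:${req.ip}`;
}

// Headers to attach to every response, allowed or rejected.
function rateLimitHeaders(limit, remaining, resetEpochSec) {
  return {
    "X-RateLimit-Limit": String(limit),
    "X-RateLimit-Remaining": String(Math.max(0, remaining)),
    "X-RateLimit-Reset": String(resetEpochSec), // Unix timestamp in seconds
  };
}

// Build the 429 response: Retry-After in seconds plus an explanatory JSON body.
function tooManyRequests(limit, used, resetEpochSec, nowEpochSec) {
  const retryAfter = Math.max(1, resetEpochSec - nowEpochSec);
  return {
    status: 429,
    headers: {
      ...rateLimitHeaders(limit, 0, resetEpochSec),
      "Retry-After": String(retryAfter),
    },
    body: { error: "rate_limit_exceeded", limit, used, resetAt: resetEpochSec },
  };
}
```

Prefixing the identifier with its kind (`key:`, `user:`, `ip:`) keeps an API key and a user ID that happen to share a value from colliding in the same counter.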
Output
- `${CLAUDE_SKILL_DIR}/src/middleware/rate-limiter.js` - Rate limiting middleware with algorithm selection
- `${CLAUDE_SKILL_DIR}/src/config/rate-limits.js` - Per-endpoint and per-tier rate limit configuration
- `${CLAUDE_SKILL_DIR}/src/utils/rate-limit-store.js` - Redis-backed distributed counter implementation
- `${CLAUDE_SKILL_DIR}/src/middleware/rate-limit-headers.js` - Standard rate limit response header injection
- `${CLAUDE_SKILL_DIR}/tests/rate-limiting/` - Rate limit threshold verification tests
- `${CLAUDE_SKILL_DIR}/docs/rate-limits.md` - Consumer-facing rate limit documentation
Error Handling
| Error | Cause | Solution |
|-------|-------|----------|
| 429 Too Many Requests | Client exceeded configured rate limit for the endpoint | Return Retry-After header with seconds until reset; include limit details in JSON body |
| Redis connection failure | Rate limit state store unavailable | Fail open (allow requests) or fail closed (reject all) based on security posture; alert immediately |
| Clock skew between instances | Distributed rate limit windows misaligned across servers | Use Redis server time (TIME command) as canonical clock; avoid relying on application server clocks |
| Inconsistent counts | Race condition in read-check-increment cycle | Use Redis MULTI/EXEC transaction or Lua script for atomic increment-and-check operations |
| Bypass abuse | Internal bypass mechanism exploited by external client | Validate bypass credentials per-request; restrict bypass to specific IP ranges or mTLS certificates |
Refer to ${CLAUDE_SKILL_DIR}/references/errors.md for comprehensive error patterns.
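To make the "Inconsistent counts" row concrete, here is a small fixed window counter in which the allow/reject decision is based on the value produced by the increment itself, never on a separate read. The `Map` is an in-memory stand-in for Redis, and the class name `FixedWindowCounter` is hypothetical. Against Redis, the same shape would typically be an `INCR` followed by `EXPIRE`, wrapped in MULTI/EXEC or a Lua script, which this sketch does not show.

```javascript
// Fixed window counter sketch. The caller supplies `nowSec`; in a distributed
// setup that value should come from one canonical clock (for example Redis
// TIME) so all instances agree on window boundaries.
class FixedWindowCounter {
  constructor(limit, windowSec) {
    this.limit = limit;
    this.windowSec = windowSec;
    this.counts = new Map(); // stand-in for Redis keys; old windows are never
                             // evicted here, whereas Redis EXPIRE would do that
  }

  hit(clientId, nowSec) {
    const windowStart = Math.floor(nowSec / this.windowSec) * this.windowSec;
    const key = `${clientId}:${windowStart}`;
    // Increment first, then decide on the new value (increment-and-check).
    const count = (this.counts.get(key) || 0) + 1;
    this.counts.set(key, count);
    return {
      allowed: count <= this.limit,
      remaining: Math.max(0, this.limit - count),
      resetAt: windowStart + this.windowSec,
    };
  }
}
```

Because every request increments before checking, two concurrent requests can never both observe "one slot left" and both pass, which is the failure mode of a read-check-increment cycle.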
Examples
Sliding window with Redis: Implement a sliding window rate limiter using Redis sorted sets, where each request adds a timestamped entry and the window count is computed by ZCOUNT (or by counting the members that ZRANGEBYSCORE returns) over the last 60 seconds.
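The sliding window log idea can be sketched without Redis by letting a per-client array of timestamps stand in for the sorted set. The class name `SlidingWindowLog` is hypothetical, and this is an illustration of the technique rather than the Redis-backed implementation; the comments mark which Redis sorted-set command each step would roughly correspond to.

```javascript
// Sliding window log sketch. Each client's timestamp array plays the role of
// a Redis sorted set scored by request time.
class SlidingWindowLog {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.logs = new Map();
  }

  hit(clientId, nowMs = Date.now()) {
    const cutoff = nowMs - this.windowMs;
    // Drop entries that have left the window (roughly ZREMRANGEBYSCORE).
    const recent = (this.logs.get(clientId) || []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    // Record the request (roughly ZADD). Rejected requests are not logged in
    // this sketch, so they do not extend the client's penalty.
    if (allowed) recent.push(nowMs);
    this.logs.set(clientId, recent);
    return { allowed, count: recent.length };
  }
}
```

Usage: `new SlidingWindowLog(100, 60_000).hit("key:abc")` returns `{ allowed, count }` for a limit of 100 requests per rolling 60 seconds. Whether rejected requests should also be logged is a policy choice; logging them punishes clients that keep retrying while over the limit.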
Tiered SaaS quotas: Free tier gets 100 requests/minute with no burst, Pro tier gets 1000 requests/minute with 2x burst for 10 seconds, Enterprise tier gets 10000 requests/minute with custom per-endpoint overrides.
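One possible shape for this tier configuration is a plain object plus a resolver that applies per-endpoint overrides. The field names (`perMinute`, `burstMultiplier`, `burstSec`, `overrides`), the `POST /reports/export` override, and the fallback to the Free tier for unknown plans are all assumptions made for illustration. The source does not state a burst policy for Enterprise, so the sketch sets none.

```javascript
// Tier table mirroring the quotas described above. Values other than the
// per-minute limits and the Pro burst are illustrative assumptions.
const TIERS = {
  free: { perMinute: 100, burstMultiplier: 1, burstSec: 0, overrides: {} },
  pro: { perMinute: 1000, burstMultiplier: 2, burstSec: 10, overrides: {} },
  enterprise: {
    perMinute: 10000,
    burstMultiplier: 1, // no burst policy given in the source; assumed none
    burstSec: 0,
    // Hypothetical override for an expensive endpoint.
    overrides: { "POST /reports/export": { perMinute: 60 } },
  },
};

// Resolve the effective limit for a plan and endpoint. Override fields win
// over the tier defaults; the `overrides` map itself is not returned.
function resolveLimit(tierName, endpoint) {
  const tier = TIERS[tierName] || TIERS.free; // unknown plans fall back to free
  const { overrides, ...base } = tier;
  return { ...base, ...(overrides[endpoint] || {}) };
}
```

Keeping overrides as partial objects means an expensive endpoint can tighten only `perMinute` while inheriting the tier's burst settings.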
Login endpoint protection: Apply strict rate limit of 5 attempts per minute per IP on /auth/login to prevent brute force attacks, with progressive lockout (15 min, 1 hour, 24 hours) after repeated violations.
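A minimal sketch of this progressive lockout, assuming in-memory state keyed by IP and a violation counter that never decays (a real deployment would likely persist this in Redis and reset violations after a quiet period). The names `LoginGuard` and `lockoutSeconds` are hypothetical.

```javascript
// Lockout durations for the 1st, 2nd, and 3rd+ violation: 15 min, 1 h, 24 h.
const LOCKOUT_STEPS_SEC = [15 * 60, 60 * 60, 24 * 60 * 60];

// Lockout length for the nth violation (1-based); stays at the last step
// once the steps are exhausted.
function lockoutSeconds(violationCount) {
  if (violationCount < 1) return 0;
  const idx = Math.min(violationCount, LOCKOUT_STEPS_SEC.length) - 1;
  return LOCKOUT_STEPS_SEC[idx];
}

class LoginGuard {
  constructor(maxAttempts = 5, windowSec = 60) {
    this.maxAttempts = maxAttempts;
    this.windowSec = windowSec;
    this.state = new Map(); // ip -> { attempts, violations, lockedUntil }
  }

  attempt(ip, nowSec) {
    const s = this.state.get(ip) || { attempts: [], violations: 0, lockedUntil: 0 };
    this.state.set(ip, s);
    if (nowSec < s.lockedUntil) {
      return { allowed: false, retryAfter: s.lockedUntil - nowSec };
    }
    // Keep only attempts inside the rolling window.
    s.attempts = s.attempts.filter((t) => t > nowSec - this.windowSec);
    if (s.attempts.length >= this.maxAttempts) {
      // Limit exceeded: escalate the lockout and clear the attempt log.
      s.violations += 1;
      s.lockedUntil = nowSec + lockoutSeconds(s.violations);
      s.attempts = [];
      return { allowed: false, retryAfter: s.lockedUntil - nowSec };
    }
    s.attempts.push(nowSec);
    return { allowed: true, retryAfter: 0 };
  }
}
```

The `retryAfter` value maps directly onto the `Retry-After` header of the 429 response for `/auth/login`.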
See ${CLAUDE_SKILL_DIR}/references/examples.md for additional examples.
Resources
- IETF RateLimit header fields draft: https://datatracker.ietf.org/doc/draft-ietf-httpapi-ratelimit-headers/
- Token bucket algorithm explained
- `rate-limiter-flexible` library: https://github.com/animir/node-rate-limiter-flexible
- Redis rate limiting patterns with Lua scripts