Agent Skills: Firecrawl Incident Runbook

|

UncategorizedID: jeremylongshore/claude-code-plugins-plus-skills/firecrawl-incident-runbook

Install this agent skill to your local

pnpm dlx add-skill https://github.com/jeremylongshore/claude-code-plugins-plus-skills/tree/HEAD/plugins/saas-packs/firecrawl-pack/skills/firecrawl-incident-runbook

Skill Files

Browse the full folder contents for firecrawl-incident-runbook.

Download Skill

Loading file tree…

plugins/saas-packs/firecrawl-pack/skills/firecrawl-incident-runbook/SKILL.md

Skill Metadata

Name
firecrawl-incident-runbook
Description
'Execute Firecrawl incident response procedures with triage, mitigation,

Firecrawl Incident Runbook

Overview

Rapid incident response procedures for Firecrawl integration failures. Covers API outage triage, credential issues, credit exhaustion, crawl job failures, and webhook delivery problems.

Severity Levels

| Level | Definition | Response Time | Examples | |-------|------------|---------------|----------| | P1 | Complete failure | < 15 min | API returns 401/500 on all requests | | P2 | Degraded service | < 1 hour | High latency, partial failures, 429s | | P3 | Minor impact | < 4 hours | Webhook delays, some empty scrapes | | P4 | No user impact | Next business day | Monitoring gaps, credit warnings |

Quick Triage (Run First)

set -euo pipefail
# 1. Test Firecrawl API directly
echo "=== API Health ==="
curl -s -w "\nHTTP %{http_code}\n" https://api.firecrawl.dev/v1/scrape \
  -H "Authorization: Bearer $FIRECRAWL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com","formats":["markdown"]}' | jq '{success, error}'

# 2. Check credit balance
echo "=== Credits ==="
curl -s https://api.firecrawl.dev/v1/team/credits \
  -H "Authorization: Bearer $FIRECRAWL_API_KEY" | jq .

# 3. Check our app health
echo "=== App Health ==="
curl -sf https://api.yourapp.com/health | jq '.services.firecrawl' || echo "App unhealthy"

Decision Tree

Firecrawl API returning errors?
├─ 401: API key invalid
│   → Verify key at firecrawl.dev/app, rotate if needed
├─ 402: Credits exhausted
│   → Upgrade plan or wait for monthly reset
├─ 429: Rate limited
│   → Reduce concurrency, enable backoff, check Retry-After
├─ 500/503: Firecrawl outage
│   → Enable fallback mode, monitor firecrawl.dev status
└─ API working fine
    └─ Our integration issue
        ├─ Empty markdown → Increase waitFor, check target site
        ├─ Crawl stuck → Check job status, enforce timeout
        └─ Webhook not firing → Verify endpoint, check signature

Immediate Actions by Error Type

401 — Authentication Failure

set -euo pipefail
# Verify current key
echo "Key prefix: ${FIRECRAWL_API_KEY:0:5}"
echo "Key length: ${#FIRECRAWL_API_KEY}"

# Test with explicit key
curl -s https://api.firecrawl.dev/v1/scrape \
  -H "Authorization: Bearer $FIRECRAWL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com","formats":["markdown"]}' | jq .success

# If fails: regenerate key at firecrawl.dev/app and update all environments

402 — Credits Exhausted

set -euo pipefail
# Check balance
curl -s https://api.firecrawl.dev/v1/team/credits \
  -H "Authorization: Bearer $FIRECRAWL_API_KEY" | jq .

# Immediate: disable non-critical scraping
# Long-term: upgrade plan or implement credit budget

429 — Rate Limited

// Enable emergency rate limiting
const EMERGENCY_DELAY_MS = 5000; // 5s between requests

async function emergencyScrape(url: string) {
  await new Promise(r => setTimeout(r, EMERGENCY_DELAY_MS));
  return firecrawl.scrapeUrl(url, { formats: ["markdown"] });
}

500/503 — Firecrawl Outage

// Enable graceful degradation
async function scrapeWithFallback(url: string) {
  try {
    return await firecrawl.scrapeUrl(url, { formats: ["markdown"] });
  } catch (error: any) {
    if (error.statusCode >= 500) {
      console.error("Firecrawl unavailable — using cached content");
      return getCachedContent(url); // serve stale data
    }
    throw error;
  }
}

Communication Templates

Internal (Slack)

P[1-4] INCIDENT: Firecrawl Integration
Status: INVESTIGATING
Impact: [Describe user-facing impact]
Error: [401/402/429/500] — [brief description]
Action: [What you're doing right now]
Next update: [time]

Post-Incident

Evidence Collection

set -euo pipefail
# Collect debug bundle
mkdir -p incident-$(date +%Y%m%d)
curl -s https://api.firecrawl.dev/v1/team/credits \
  -H "Authorization: Bearer $FIRECRAWL_API_KEY" > incident-$(date +%Y%m%d)/credits.json

# Application logs
kubectl logs -l app=my-app --since=1h | grep -i firecrawl > incident-$(date +%Y%m%d)/logs.txt 2>/dev/null || true

Postmortem Template

## Incident: Firecrawl [Error Type]
Date: YYYY-MM-DD | Duration: X hours | Severity: P[1-4]

### Summary
[1-2 sentence description]

### Timeline
- HH:MM — [First alert]
- HH:MM — [Investigation started]
- HH:MM — [Root cause identified]
- HH:MM — [Resolved]

### Root Cause
[Technical explanation]

### Action Items
- [ ] [Preventive measure] — Owner — Due date

Error Handling

| Issue | Cause | Solution | |-------|-------|----------| | Can't reach Firecrawl API | Network/DNS issue | Try from different network, check DNS | | All scrapes return empty | Target site changed | Verify manually, adjust scrape options | | Crawl jobs never complete | Queue backup | Cancel stuck jobs, reduce concurrency | | Webhook endpoint unreachable | Deployment issue | Check HTTPS cert, DNS, firewall |

Resources

Next Steps

For data handling, see firecrawl-data-handling.