Agent Skills: Firecrawl Webhooks & Events

Install this agent skill locally:

pnpm dlx add-skill https://github.com/jeremylongshore/claude-code-plugins-plus-skills/tree/HEAD/plugins/saas-packs/firecrawl-pack/skills/firecrawl-webhooks-events

Firecrawl Webhooks & Events

Overview

Handle Firecrawl webhooks to receive real-time notifications for async crawl and batch scrape jobs. Instead of polling checkCrawlStatus, configure a webhook URL and Firecrawl will POST events as pages are scraped and as jobs complete. Each request is signed with HMAC-SHA256, and the signature is sent in the X-Firecrawl-Signature header.

Webhook Event Types

| Event | Trigger | Payload |
|-------|---------|---------|
| crawl.started | Crawl job begins | Job ID, config |
| crawl.page | Individual page scraped | Page markdown, metadata |
| crawl.completed | Full crawl finishes | All pages array |
| crawl.failed | Crawl job errors | Error message |
| batch_scrape.completed | Batch scrape finishes | All scraped pages |
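
The event names above, together with the envelope fields read by the handler in Step 2 (`type`, `id`, `data`, `metadata`), can be modeled as a typed structure so that malformed or unknown events are rejected early. This is a minimal sketch based only on the fields this document uses, not on Firecrawl's published schema; `FirecrawlEvent`, `isEventType`, and `parseEvent` are hypothetical names.

```typescript
// Event names taken from the table above.
const EVENT_TYPES = [
  "crawl.started",
  "crawl.page",
  "crawl.completed",
  "crawl.failed",
  "batch_scrape.completed",
] as const;

type EventType = (typeof EVENT_TYPES)[number];

// Hypothetical envelope: only the fields this document's handler reads.
interface FirecrawlEvent {
  type: EventType;
  id: string;
  data: unknown;
  metadata?: Record<string, unknown>;
}

function isEventType(value: unknown): value is EventType {
  return (
    typeof value === "string" &&
    (EVENT_TYPES as readonly string[]).indexOf(value) !== -1
  );
}

// Parse a raw webhook body; returns null for malformed JSON or unknown events.
function parseEvent(rawBody: string): FirecrawlEvent | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const candidate = parsed as Record<string, unknown>;
  if (!isEventType(candidate.type) || typeof candidate.id !== "string") {
    return null;
  }
  return {
    type: candidate.type,
    id: candidate.id,
    data: candidate.data,
    metadata: candidate.metadata as Record<string, unknown> | undefined,
  };
}
```

A handler could call `parseEvent(rawBody)` after signature verification and ignore the request when the result is `null`.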

Instructions

Step 1: Start Crawl with Webhook

import FirecrawlApp from "@mendable/firecrawl-js";

const firecrawl = new FirecrawlApp({
  apiKey: process.env.FIRECRAWL_API_KEY!,
});

// Webhook as string (simple)
const job = await firecrawl.asyncCrawlUrl("https://docs.example.com", {
  limit: 100,
  scrapeOptions: { formats: ["markdown"] },
  webhook: "https://api.yourapp.com/webhooks/firecrawl",
});

console.log(`Crawl started: ${job.id}`);

// Webhook as object (with metadata and event filtering)
const job2 = await firecrawl.asyncCrawlUrl("https://docs.example.com", {
  limit: 100,
  scrapeOptions: { formats: ["markdown"] },
  webhook: {
    url: "https://api.yourapp.com/webhooks/firecrawl",
    events: ["completed", "page"],  // only these events
    metadata: {
      projectId: "my-project",
      triggeredBy: "cron",
    },
  },
});
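
The `events` array in the object form uses short names (`"completed"`, `"page"`), while the event table lists full names such as `crawl.completed`. Assuming the short name is the part after the last dot (an inference from this document, not a confirmed rule), a handler can mirror the same filter locally and ignore events it did not subscribe to. `shortEventName` and `isSubscribed` are hypothetical helpers.

```typescript
// Assumption: "crawl.completed" -> "completed", "crawl.page" -> "page".
function shortEventName(type: string): string {
  const dot = type.lastIndexOf(".");
  return dot === -1 ? type : type.slice(dot + 1);
}

// An empty filter is treated here as "all events".
function isSubscribed(type: string, events: readonly string[]): boolean {
  return events.length === 0 || events.indexOf(shortEventName(type)) !== -1;
}

const subscribed = ["completed", "page"];
console.log(isSubscribed("crawl.page", subscribed));    // true
console.log(isSubscribed("crawl.started", subscribed)); // false
```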

Step 2: Webhook Handler with Signature Verification

import express from "express";
import crypto from "crypto";

const app = express();
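// Do not register express.json() globally ahead of the webhook route: it would
// parse the body first, and the HMAC must be computed over the raw bytes that
// the route-level express.raw() middleware provides.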

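function verifySignature(body: string, signature: string | undefined): boolean {
  const secret = process.env.FIRECRAWL_WEBHOOK_SECRET;
  if (!secret) return true; // skip if not configured (local development only)
  if (!signature) return false; // header missing
  const expected = crypto.createHmac("sha256", secret).update(body).digest("hex");
  // The header value may carry a "sha256=" prefix (check the Firecrawl docs); strip it if present
  const received = Buffer.from(signature.replace(/^sha256=/, ""));
  const expectedBuf = Buffer.from(expected);
  // timingSafeEqual throws on unequal lengths, so compare lengths first
  return received.length === expectedBuf.length && crypto.timingSafeEqual(received, expectedBuf);
}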

app.post("/webhooks/firecrawl", express.raw({ type: "application/json" }), async (req, res) => {
  const rawBody = req.body.toString();
  const signature = req.headers["x-firecrawl-signature"] as string;

  if (!verifySignature(rawBody, signature)) {
    return res.status(401).json({ error: "Invalid signature" });
  }

  const { type, id, data, metadata } = JSON.parse(rawBody);

  // Respond immediately — process asynchronously
  res.status(200).json({ received: true });

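  // The 200 has already been sent, so catch and log processing errors here
  // instead of letting them surface as unhandled promise rejections.
  try {
    switch (type) {
      case "crawl.started":
        console.log(`Crawl ${id} started`);
        break;
      case "crawl.page":
        await handlePageScraped(id, data, metadata);
        break;
      case "crawl.completed":
        await handleCrawlComplete(id, data, metadata);
        break;
      case "crawl.failed":
        await handleCrawlFailed(id, data);
        break;
    }
  } catch (err) {
    console.error(`Failed to process ${type} for job ${id}:`, err);
  }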
});
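
To exercise the signature check without waiting for a real crawl, a test payload can be signed locally with the same secret. The sketch below assumes the signature is the hex HMAC-SHA256 of the raw body, optionally prefixed with `sha256=`; confirm the exact header format against the Firecrawl documentation. `signPayload` and `isValidSignature` are hypothetical helpers, and `test-secret` is a placeholder.

```typescript
import { createHmac, timingSafeEqual } from "crypto";

// Assumption: hex HMAC-SHA256 over the exact raw body string.
function signPayload(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody).digest("hex");
}

function isValidSignature(rawBody: string, header: string, secret: string): boolean {
  // Strip an optional "sha256=" prefix before comparing.
  const received = Buffer.from(header.replace(/^sha256=/, ""));
  const expected = Buffer.from(signPayload(rawBody, secret));
  // Compare lengths first because timingSafeEqual throws on a mismatch.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

const secret = "test-secret"; // placeholder value
const body = JSON.stringify({ type: "crawl.started", id: "job-1", data: [] });
const header = `sha256=${signPayload(body, secret)}`;

console.log(isValidSignature(body, header, secret));       // true
console.log(isValidSignature(body + " ", header, secret)); // false: body changed
```

The second call fails because even a single added space changes the HMAC, which is why the handler must verify the raw body rather than re-serialized JSON.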

Step 3: Process Page Events (Streaming)

async function handlePageScraped(jobId: string, data: any[], metadata: any) {
  for (const page of data) {
    const doc = {
      url: page.metadata?.sourceURL,
      title: page.metadata?.title,
      markdown: page.markdown,
      statusCode: page.metadata?.statusCode,
      crawlJobId: jobId,
      projectId: metadata?.projectId,
      indexedAt: new Date(),
    };

    // Index page immediately — don't wait for full crawl
    await documentStore.upsert(doc);
    console.log(`Indexed: ${doc.url} (${doc.markdown?.length || 0} chars)`);
  }
}

Step 4: Handle Crawl Completion

async function handleCrawlComplete(jobId: string, data: any[], metadata: any) {
  console.log(`Crawl ${jobId} complete: ${data.length} pages`);

  // Build search index from all crawled pages
  const documents = data
    .filter(page => page.markdown && page.markdown.length > 100)
    .map(page => ({
      id: page.metadata?.sourceURL,
      title: page.metadata?.title || "",
      content: page.markdown,
      url: page.metadata?.sourceURL,
    }));

  await searchIndex.indexBatch(documents);
  console.log(`Indexed ${documents.length} documents for project ${metadata?.projectId}`);
}

async function handleCrawlFailed(jobId: string, data: any) {
  console.error(`Crawl ${jobId} failed:`, data.error);

  await alerting.send({
    severity: "high",
    message: `Firecrawl crawl job ${jobId} failed`,
    error: data.error,
    partialResults: data.partialResults?.length || 0,
  });
}

Step 5: Polling as Webhook Fallback

// Fall back to polling if webhook delivery fails
async function pollWithFallback(jobId: string, timeoutMs = 600000) {
  const deadline = Date.now() + timeoutMs;
  let interval = 2000;

  while (Date.now() < deadline) {
    const status = await firecrawl.checkCrawlStatus(jobId);

    if (status.status === "completed") {
      return status.data;
    }
    if (status.status === "failed") {
      throw new Error(`Crawl failed: ${status.error}`);
    }

    console.log(`Polling: ${status.completed}/${status.total} pages`);
    await new Promise(r => setTimeout(r, interval));
    interval = Math.min(interval * 1.5, 30000);
  }

  throw new Error(`Crawl timed out after ${timeoutMs}ms`);
}
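
Deciding when to fall back requires knowing which jobs have not yet produced a terminal webhook. One way to track that is sketched below; `PendingJobs` is a hypothetical in-memory helper, and the 60-second grace period in the usage lines is an arbitrary example value.

```typescript
// Hypothetical tracker: remembers when each job started until its
// terminal webhook (crawl.completed or crawl.failed) arrives.
class PendingJobs {
  private startedAt = new Map<string, number>();

  // Call right after starting a crawl.
  markStarted(jobId: string, now: number = Date.now()): void {
    this.startedAt.set(jobId, now);
  }

  // Call from the webhook handler on crawl.completed or crawl.failed.
  markFinished(jobId: string): void {
    this.startedAt.delete(jobId);
  }

  // Jobs whose terminal webhook has not arrived within graceMs.
  overdue(graceMs: number, now: number = Date.now()): string[] {
    const ids: string[] = [];
    this.startedAt.forEach((started, jobId) => {
      if (now - started >= graceMs) ids.push(jobId);
    });
    return ids;
  }
}

const pending = new PendingJobs();
pending.markStarted("job-a", 0);
pending.markStarted("job-b", 0);
pending.markFinished("job-a"); // webhook arrived for job-a
const overdueJobs = pending.overdue(60_000, 90_000);
console.log(overdueJobs); // only job-b remains overdue
```

Each overdue ID could then be passed to `pollWithFallback` on a timer, so polling only runs for jobs whose webhooks appear to be lost.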

Error Handling

| Issue | Cause | Solution |
|-------|-------|----------|
| Webhook not received | URL not publicly accessible | Use ngrok for local dev, verify HTTPS |
| Signature mismatch | Wrong secret or body encoding | Use raw body for HMAC, not parsed JSON |
| Duplicate events | Firecrawl retry on non-2xx | Make handler idempotent (dedup by job ID) |
| Webhook timeout | Processing takes too long | Return 200 immediately, process async |
| Lost events | 3 failed retries | Implement polling fallback |
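
For the duplicate-events row, a job ID alone is too coarse a key for `crawl.page`, since one job emits many page events. The sketch below keys on job ID, event type, and page URL; it is an in-memory illustration under that assumption, and `EventDeduplicator` and `eventKey` are hypothetical names. A deployment with several handler instances would need a shared store instead.

```typescript
// Build a dedup key; pageUrl is only meaningful for per-page events.
function eventKey(jobId: string, type: string, pageUrl?: string): string {
  return [jobId, type, pageUrl || ""].join("|");
}

// Hypothetical in-memory deduplicator with a time-to-live per key.
class EventDeduplicator {
  private seen = new Map<string, number>(); // key -> expiry timestamp (ms)
  private ttlMs: number;

  constructor(ttlMs: number = 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  // Returns true the first time a key is seen within the TTL, false for repeats.
  firstSeen(key: string, now: number = Date.now()): boolean {
    // Drop expired entries so the map does not grow without bound.
    this.seen.forEach((expiry, k) => {
      if (expiry <= now) this.seen.delete(k);
    });
    if (this.seen.has(key)) return false;
    this.seen.set(key, now + this.ttlMs);
    return true;
  }
}

const dedup = new EventDeduplicator(1000);
const key = eventKey("job-1", "crawl.page", "https://docs.example.com/a");
console.log(dedup.firstSeen(key, 0));    // true: first delivery
console.log(dedup.firstSeen(key, 500));  // false: retry within the TTL
console.log(dedup.firstSeen(key, 2000)); // true: entry expired
```

In the handler, a `false` result means the event was already processed and can be acknowledged with a 200 without further work.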

Examples

Local Development with ngrok

set -euo pipefail
# Start ngrok tunnel for local webhook testing
ngrok http 3000
# Use the ngrok URL as your webhook endpoint
# https://abc123.ngrok.io/webhooks/firecrawl

Next Steps

For deployment setup, see firecrawl-deploy-integration.