Agent Skills: Perplexity Reliability Patterns

Category: Uncategorized
ID: jeremylongshore/claude-code-plugins-plus-skills/perplexity-reliability-patterns

Install this agent skill locally:

pnpm dlx add-skill https://github.com/jeremylongshore/claude-code-plugins-plus-skills/tree/HEAD/plugins/saas-packs/perplexity-pack/skills/perplexity-reliability-patterns

plugins/saas-packs/perplexity-pack/skills/perplexity-reliability-patterns/SKILL.md

Perplexity Reliability Patterns

Overview

Production reliability patterns for the Perplexity Sonar API. Perplexity performs a live web search for each request, so response times are inherently variable. The key reliability challenges are that searches can stall, citation links can break, and model tiers differ in availability.

Prerequisites

  • Perplexity API key configured
  • Cache layer (Redis or in-memory)
  • Understanding of search latency variability
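The first prerequisite can be checked at startup instead of surfacing later as a 401. A minimal sketch, assuming the key lives in the `PERPLEXITY_API_KEY` variable used in Step 1; `requireApiKey` is a hypothetical helper name, not part of any SDK.

```typescript
// Hypothetical startup check: fail fast when the key is missing or blank.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env.PERPLEXITY_API_KEY?.trim();
  if (!key) {
    throw new Error("PERPLEXITY_API_KEY is not set");
  }
  return key;
}
```

In a Node process this would typically be called once as `requireApiKey(process.env)` before the client is constructed.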

Instructions

Step 1: Model Tier Fallback

import OpenAI from "openai";

const perplexity = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY!,
  baseURL: "https://api.perplexity.ai",
});

async function resilientSearch(
  query: string,
  preferredModel: string = "sonar-pro"
) {
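  // De-duplicate so that preferredModel = "sonar" does not retry the same model twice.
  const fallbackChain = Array.from(new Set([preferredModel, "sonar"]));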
  let lastError: Error | null = null;

  for (const model of fallbackChain) {
    try {
      const response = await perplexity.chat.completions.create({
        model,
        messages: [{ role: "user", content: query }],
        max_tokens: model === "sonar-pro" ? 2048 : 512,
      });

      if (model !== preferredModel) {
        console.warn(`[Reliability] Fell back from ${preferredModel} to ${model}`);
      }

      return {
        answer: response.choices[0].message.content || "",
        citations: (response as any).citations || [],
        model: response.model,
        fallback: model !== preferredModel,
      };
    } catch (err: any) {
      lastError = err;
      if (err.status === 401 || err.status === 402) throw err; // Don't retry auth/billing
      console.warn(`[Reliability] ${model} failed (${err.status || err.message}), trying next`);
    }
  }

  throw lastError || new Error("All models failed");
}
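The loop above rethrows only on 401 and 402 and falls through to the next model for everything else. One possible refinement is to decide per status code whether another model tier could plausibly succeed. The groupings below are assumptions based on general HTTP semantics, not documented Perplexity behavior, and `shouldTryNextModel` is a hypothetical name.

```typescript
// Hypothetical classifier: is it worth trying the next model in the chain?
function shouldTryNextModel(status?: number): boolean {
  if (status === undefined) return true; // no HTTP status: network error or timeout
  if (status === 401 || status === 402) return false; // auth/billing, as in the loop above
  if (status === 408 || status === 429) return true; // timeout or rate limit (assumed transient)
  if (status >= 400 && status < 500) return false; // other client errors (assumed model-independent)
  return true; // 5xx and anything unexpected
}
```

Inside the `catch` block, `if (!shouldTryNextModel(err.status)) throw err;` would replace the explicit 401/402 comparison.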

Step 2: Circuit Breaker

class CircuitBreaker {
  private failures = 0;
  private lastFailure = 0;
  private state: "closed" | "open" | "half-open" = "closed";

  constructor(
    private threshold: number = 5,
    private resetTimeMs: number = 60000
  ) {}

  async execute<T>(fn: () => Promise<T>, fallback: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      if (Date.now() - this.lastFailure > this.resetTimeMs) {
        this.state = "half-open";
      } else {
        console.warn("[CircuitBreaker] Open — using fallback");
        return fallback();
      }
    }

    try {
      const result = await fn();
      // Any success closes the circuit and clears the failure count,
      // so only consecutive failures can trip the breaker.
      this.state = "closed";
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures++;
      this.lastFailure = Date.now();
      if (this.failures >= this.threshold) {
        this.state = "open";
        console.warn(`[CircuitBreaker] Opened after ${this.failures} failures`);
      }
      return fallback();
    }
  }

  get status() {
    return { state: this.state, failures: this.failures };
  }
}

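// Usage (assumes `query` is in scope and `getCachedResult` is your own helper
// that resolves with the last known good result for that query)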
const breaker = new CircuitBreaker(5, 60000);
const cachedFallback = () => getCachedResult(query);

const result = await breaker.execute(
  () => resilientSearch(query, "sonar-pro"),
  cachedFallback
);
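The usage above calls `getCachedResult`, which this skill does not define. A minimal in-memory stand-in might look like the following sketch. `CachedAnswer`, `rememberResult`, and the `Map` store are hypothetical; the field names simply mirror what `resilientSearch` returns in Step 1.

```typescript
// Hypothetical shape, mirroring the object returned by resilientSearch.
interface CachedAnswer {
  answer: string;
  citations: string[];
  model: string;
  fallback: boolean;
}

// Unbounded map for illustration only; a real deployment would cap size and age.
const lastGoodResults = new Map<string, CachedAnswer>();

// Call this after every successful live search.
function rememberResult(query: string, result: CachedAnswer): void {
  lastGoodResults.set(query, result);
}

// Breaker fallback: resolve with the last good answer, or reject if there is none.
async function getCachedResult(query: string): Promise<CachedAnswer> {
  const hit = lastGoodResults.get(query);
  if (!hit) throw new Error(`No cached result for query: ${query}`);
  return hit;
}
```

Rejecting on a miss keeps the failure visible to the caller instead of returning an empty answer that looks like a real one.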

Step 3: Streaming with Timeout Protection

async function* streamWithTimeout(
  query: string,
  model: string = "sonar",
  chunkTimeoutMs: number = 10000
): AsyncGenerator<{ type: "text" | "citations" | "timeout"; data: any }> {
  const stream = await perplexity.chat.completions.create({
    model,
    messages: [{ role: "user", content: query }],
    stream: true,
    max_tokens: 2048,
  });

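  const iterator = stream[Symbol.asyncIterator]();

  while (true) {
    // Race the next chunk against a timer so that a stalled stream is detected
    // even when no further chunk ever arrives.
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timedOut = new Promise<"timeout">((resolve) => {
      timer = setTimeout(() => resolve("timeout"), chunkTimeoutMs);
    });
    const next = await Promise.race([iterator.next(), timedOut]);
    clearTimeout(timer);

    if (next === "timeout") {
      stream.controller.abort(); // cancel the underlying HTTP request
      yield { type: "timeout", data: `Stream stalled: no data for ${chunkTimeoutMs}ms` };
      return;
    }
    if (next.done) return;

    const chunk = next.value;
    const text = chunk.choices[0]?.delta?.content || "";
    if (text) yield { type: "text", data: text };

    const citations = (chunk as any).citations;
    if (citations) yield { type: "citations", data: citations };
  }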
}

// Usage
for await (const event of streamWithTimeout("explain quantum computing", "sonar-pro")) {
  if (event.type === "text") process.stdout.write(event.data);
  if (event.type === "citations") console.log("\nSources:", event.data);
  if (event.type === "timeout") console.error("\nStream timed out");
}
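Callers that need the complete answer rather than incremental writes can fold the events into a single object. A sketch under stated assumptions: `StreamEvent` mirrors the generator's yield type above, while `StreamSummary` and `applyEvent` are hypothetical names, and each citations event is assumed to carry the full list rather than a delta.

```typescript
type StreamEvent = { type: "text" | "citations" | "timeout"; data: any };

interface StreamSummary {
  text: string;
  citations: string[];
  timedOut: boolean;
}

const emptySummary: StreamSummary = { text: "", citations: [], timedOut: false };

// Pure reducer: returns a new summary with one event applied.
function applyEvent(summary: StreamSummary, event: StreamEvent): StreamSummary {
  switch (event.type) {
    case "text":
      return { ...summary, text: summary.text + event.data };
    case "citations":
      // Keep the most recent list (assumes it is complete, not incremental).
      return { ...summary, citations: event.data };
    default:
      // "timeout": keep the partial text so the caller can still show it.
      return { ...summary, timedOut: true };
  }
}
```

Within the `for await` loop this becomes `summary = applyEvent(summary, event)`, starting from `emptySummary`.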

Step 4: Cache as Reliability Layer

import { LRUCache } from "lru-cache";
import { createHash } from "crypto";

const reliabilityCache = new LRUCache<string, any>({
  max: 500,
  ttl: 24 * 3600_000, // 24-hour stale cache for reliability
});

async function searchWithCacheFallback(query: string, model = "sonar") {
  const key = createHash("sha256").update(`${model}:${query}`).digest("hex");

  try {
    const response = await resilientSearch(query, model);
    // Update cache on success
    reliabilityCache.set(key, response);
    return { ...response, source: "live" };
  } catch {
    // Serve stale cache as last resort
    const cached = reliabilityCache.get(key);
    if (cached) {
      console.warn("[Reliability] Serving stale cached result");
      return { ...cached, source: "stale-cache" };
    }
    throw new Error("Perplexity unavailable and no cached result");
  }
}
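Because the cache key is a hash of the raw query string, queries that differ only in spacing or capitalization land in separate entries. One way to raise the stale-cache hit rate is to normalize the query before hashing. This is a sketch: `normalizeQuery` is a hypothetical helper, and lowercasing is an assumption that should be dropped if case changes the meaning of your queries.

```typescript
// Hypothetical normalizer: map trivially different queries to one cache entry.
function normalizeQuery(query: string): string {
  return query
    .trim()               // drop leading and trailing whitespace
    .replace(/\s+/g, " ") // collapse internal runs of whitespace
    .toLowerCase();       // assumption: queries are case-insensitive
}
```

The key would then be built from `${model}:${normalizeQuery(query)}`, while the original query is still what gets sent to the API.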

Step 5: Citation URL Validation

async function validateCitations(
  citations: string[],
  timeoutMs: number = 5000
): Promise<Array<{ url: string; status: number; valid: boolean }>> {
  const results = await Promise.allSettled(
    citations.slice(0, 5).map(async (url) => {
      const controller = new AbortController();
      const timeout = setTimeout(() => controller.abort(), timeoutMs);
      try {
        const response = await fetch(url, {
          method: "HEAD",
          signal: controller.signal,
          redirect: "follow",
        });
        return { url, status: response.status, valid: response.status < 400 };
      } catch {
        return { url, status: 0, valid: false };
      } finally {
        clearTimeout(timeout);
      }
    })
  );

  // Results keep the input order, so the index recovers the URL for a rejected check.
  return results.map((r, i) =>
    r.status === "fulfilled" ? r.value : { url: citations[i], status: 0, valid: false }
  );
}
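Before spending network requests on validation, it can help to discard entries that can never pass a HEAD check. The sketch below keeps only unique, well-formed http(s) URLs; `checkableUrls` is a hypothetical helper, and it relies on the standard `URL` constructor.

```typescript
// Hypothetical pre-filter: unique, parseable http(s) URLs in first-seen order.
function checkableUrls(citations: string[]): string[] {
  const seen = new Set<string>();
  for (const raw of citations) {
    try {
      const url = new URL(raw);
      if (url.protocol === "http:" || url.protocol === "https:") {
        seen.add(url.href);
      }
    } catch {
      // Not a parseable absolute URL: skip it rather than sending a request.
    }
  }
  return Array.from(seen);
}
```

Passing `checkableUrls(citations)` to `validateCitations` also means the five-URL limit is spent on distinct links.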

Error Handling

| Issue | Cause | Solution |
|-------|-------|----------|
| sonar-pro timeout >15s | Complex multi-source search | Fall back to sonar |
| Stream stalls | Search hanging on source | Per-chunk timeout detection |
| Broken citation links | Source pages moved/deleted | Validate URLs before displaying |
| All models failing | Perplexity outage | Serve stale cache, circuit breaker |
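The first row of the table presumes that each call has a deadline, which the Step 1 loop does not set. A generic caller-side deadline could be sketched as follows. `withDeadline` is a hypothetical helper, not part of the Perplexity API or the `openai` package, and it only stops waiting; it does not cancel the request.

```typescript
// Hypothetical helper: reject if `work` does not settle within `ms` milliseconds.
function withDeadline<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Deadline of ${ms}ms exceeded`)),
      ms
    );
    work.then(
      (value) => {
        clearTimeout(timer); // settled in time: cancel the pending rejection
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      }
    );
  });
}
```

Wrapping the `create` call in `withDeadline(..., 15000)` would make a slow sonar-pro search throw and fall through to sonar. The `openai` client also accepts a per-request `timeout` option, which may be simpler if your SDK version supports it.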

Output

  • Model tier fallback chain
  • Circuit breaker preventing cascade failures
  • Streaming with stall detection
  • Cache as reliability layer (stale > unavailable)
  • Citation URL validation

Next Steps

For policy enforcement, see perplexity-policy-guardrails.