Perplexity Incident Runbook
Overview
Rapid incident response for Perplexity Sonar API issues. Because the API depends on live web search, outages are not always total: they can be partial (search degraded but API responding), model-specific (sonar-pro down but sonar working), or citation-related (answers returned but no sources).
Severity Levels
| Level | Definition | Response Time | Example |
|-------|-----------|--------------|---------|
| P1 | Complete API failure | < 15 min | All requests returning 500/503 |
| P2 | Degraded service | < 1 hour | High latency, 429 rate limits, no citations |
| P3 | Minor impact | < 4 hours | Single model unavailable, sporadic errors |
| P4 | No user impact | Next business day | Monitoring gap, stale cache |
Quick Triage (Run Immediately)
set -uo pipefail  # -e omitted on purpose: a failed curl must not stop the remaining checks
echo "=== Perplexity Triage ==="
# 1. Test sonar model
echo -n "sonar: "
curl -s -w "HTTP %{http_code} in %{time_total}s" -o /dev/null \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions
echo ""
# 2. Test sonar-pro model
echo -n "sonar-pro: "
curl -s -w "HTTP %{http_code} in %{time_total}s" -o /dev/null \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sonar-pro","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions
echo ""
# 3. Check API reachability (deliberately invalid key; a 401 means the API is up and enforcing auth)
echo -n "Auth: "
curl -s -o /dev/null -w "%{http_code}" \
-H "Authorization: Bearer invalid-key" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions
echo " (expect 401 = API reachable)"
# 4. DNS check
echo -n "DNS: "
dig +short api.perplexity.ai
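The triage script covers reachability and latency but not the third failure mode from the overview: a 200 response with no sources. A minimal sketch of that check follows. It assumes the parsed response body exposes sources in a top-level `citations` array of URLs and/or a `search_results` array of objects with a `url` field; verify both field names against the current API reference. `countCitations` and `SonarResponseLike` are hypothetical names, not part of any SDK.

```typescript
// Hypothetical helper: counts distinct source URLs in a parsed chat-completions body.
// Assumption: sources appear in `citations` (string URLs) and/or
// `search_results` (objects with a `url` string). Verify against the API reference.
interface SonarResponseLike {
  citations?: unknown;
  search_results?: unknown;
}

function countCitations(body: SonarResponseLike): number {
  const urls = new Set<string>();
  if (Array.isArray(body.citations)) {
    for (const c of body.citations) {
      if (typeof c === "string" && c.length > 0) urls.add(c);
    }
  }
  if (Array.isArray(body.search_results)) {
    for (const r of body.search_results) {
      if (r && typeof r === "object" && typeof (r as { url?: unknown }).url === "string") {
        urls.add((r as { url: string }).url);
      }
    }
  }
  return urls.size;
}

// Example: a degraded response (HTTP 200, answer text present, no sources).
const degraded = { choices: [{ message: { content: "..." } }], citations: [] };
console.log(
  countCitations(degraded) === 0 ? "no citations: search degraded" : "citations present",
);
```

Running this against the body saved from a triage request separates "API up, search degraded" from a healthy response.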
Decision Tree
API returning errors?
├─ 401/402: Auth issue
│ └─ Verify API key → Regenerate at perplexity.ai/settings/api
├─ 429: Rate limited
│ └─ Enable request queue → Reduce concurrency → Wait
├─ 500/503: Server error
│ ├─ All models affected?
│ │ ├─ YES → Perplexity outage. Enable fallback/cache.
│ │ └─ NO → Model-specific issue. Route to working model.
│ └─ Check Perplexity community forum for status
├─ Timeout: No response
│ ├─ DNS resolves? → Check network/firewall
│ └─ DNS fails? → DNS issue. Use alternative resolver.
└─ 200 but no citations: Search degraded
  └─ Switch to sonar-pro for more citations
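The tree above can be encoded as a small function for automated alerts or a chatops command. This is a sketch under stated assumptions: the `Probe` shape and the returned strings are illustrative, and the caller is expected to fill the fields from the triage results.

```typescript
// Observed symptoms for one probe request. All names here are illustrative.
interface Probe {
  status?: number;           // HTTP status; undefined if the request got no response
  timedOut: boolean;
  dnsResolves: boolean;
  allModelsFailing: boolean; // true if both sonar and sonar-pro return 5xx
  citationCount: number;
}

// Mirrors the decision tree branch by branch.
function nextAction(p: Probe): string {
  if (p.timedOut || p.status === undefined) {
    return p.dnsResolves
      ? "check network/firewall"
      : "DNS issue: use alternative resolver";
  }
  if (p.status === 401 || p.status === 402) return "verify or regenerate API key";
  if (p.status === 429) return "enable request queue and reduce concurrency";
  if (p.status >= 500) {
    return p.allModelsFailing
      ? "Perplexity outage: enable fallback/cache"
      : "model-specific issue: route to working model";
  }
  if (p.status === 200 && p.citationCount === 0) {
    return "search degraded: try sonar-pro";
  }
  return "no action";
}

console.log(
  nextAction({ status: 429, timedOut: false, dnsResolves: true, allModelsFailing: false, citationCount: 0 }),
);
```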
Immediate Actions
Auth Failure (401/402)
set -euo pipefail
# Verify current key
echo "Key prefix: ${PERPLEXITY_API_KEY:0:5}"
echo "Key length: ${#PERPLEXITY_API_KEY}"
# If key is invalid: regenerate at perplexity.ai/settings/api
# Update in secret manager:
# printf '%s' "NEW_KEY" | gcloud secrets versions add perplexity-api-key --data-file=-
# kubectl create secret generic perplexity-secrets --from-literal=api-key=NEW_KEY --dry-run=client -o yaml | kubectl apply -f -
# kubectl rollout restart deployment/your-app
Rate Limited (429)
set -euo pipefail
# Check if we're making too many requests
# Default limit: 50 RPM per API key (limits vary by model and usage tier; confirm against your account)
# Immediate: reduce concurrency
# kubectl set env deployment/your-app PERPLEXITY_MAX_CONCURRENT=1
# Enable request queuing if not already active
# kubectl set env deployment/your-app PERPLEXITY_QUEUE_MODE=true
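The environment variables above only help if the application implements a limiter behind them. As a rough sketch of what that could look like, the following FIFO limiter caps in-flight requests and queues the rest. `createLimiter` is a hypothetical helper, and wiring its argument to `PERPLEXITY_MAX_CONCURRENT` is left to the application.

```typescript
// Minimal FIFO concurrency limiter: at most `maxConcurrent` tasks run at once;
// additional callers wait in arrival order until a slot is handed to them.
function createLimiter(maxConcurrent: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    if (active < maxConcurrent) {
      active++; // free slot: take it immediately
    } else {
      // No free slot: wait until a finishing task hands its slot over.
      await new Promise<void>((resolve) => waiting.push(resolve));
    }
    try {
      return await task();
    } finally {
      const next = waiting.shift();
      if (next) {
        next(); // transfer the slot directly, so `active` stays unchanged
      } else {
        active--; // nobody waiting: release the slot
      }
    }
  };
}

// Usage: with a limit of 1, these three tasks run strictly one after another.
const runLimited = createLimiter(1);
for (const label of ["a", "b", "c"]) {
  void runLimited(async () => {
    console.log(`start ${label}`);
  });
}
```

Handing the slot directly to the next waiter, rather than decrementing and re-incrementing the counter, avoids a window in which a new caller could jump the queue.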
Model-Specific Fallback
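// Assumption: `perplexity` is an OpenAI-compatible client pointed at the Perplexity API.
import OpenAI from "openai";

const perplexity = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});

// If sonar-pro is failing with a server error, fall back to sonar
async function resilientSearch(query: string) {
  try {
    return await perplexity.chat.completions.create({
      model: "sonar-pro",
      messages: [{ role: "user", content: query }],
    });
  } catch (err: any) {
    // Only 5xx triggers the fallback; auth and rate-limit errors are rethrown
    if (err?.status >= 500) {
      console.warn("sonar-pro unavailable, falling back to sonar");
      return await perplexity.chat.completions.create({
        model: "sonar",
        messages: [{ role: "user", content: query }],
      });
    }
    throw err;
  }
}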
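The model fallback does not help when every model is failing, which the decision tree handles with "enable fallback/cache". One possible shape for that cache is sketched below: it remembers the last good answer per query and serves it, flagged as stale, when the live call throws. `createCachedSearch` and the in-memory `Map` are illustrative assumptions; a production setup would more likely use a shared cache.

```typescript
// Hypothetical stale-cache wrapper around any async search function.
type Cached<T> = { value: T; storedAt: number };

function createCachedSearch<T>(
  search: (query: string) => Promise<T>,
  maxStaleMs: number,
) {
  const cache = new Map<string, Cached<T>>();

  return async (query: string): Promise<{ value: T; stale: boolean }> => {
    try {
      const value = await search(query);
      cache.set(query, { value, storedAt: Date.now() }); // remember last good answer
      return { value, stale: false };
    } catch (err) {
      const hit = cache.get(query);
      // Serve the cached answer only if it is recent enough to be useful.
      if (hit && Date.now() - hit.storedAt <= maxStaleMs) {
        return { value: hit.value, stale: true };
      }
      throw err; // nothing usable cached: surface the original error
    }
  };
}
```

Wrapping the fallback function, for example `createCachedSearch(resilientSearch, 15 * 60_000)`, lets callers use the `stale` flag to show a "results may be outdated" notice during an outage.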
Communication Templates
Internal (Slack)
P[1-4] INCIDENT: Perplexity Search Integration
Status: INVESTIGATING | IDENTIFIED | MONITORING | RESOLVED
Impact: [What users see — degraded search, no citations, etc.]
Cause: [API error / rate limit / auth / Perplexity outage]
Action: [What we're doing]
ETA: [Next update time]
IC: @[name]
Post-Incident
Evidence Collection
set -euo pipefail
# Collect debug bundle
mkdir -p incident-evidence
# API response during incident
curl -s \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
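https://api.perplexity.ai/chat/completions > incident-evidence/api-response.json || true  # keep going if the API is unreachable
# Application logs (--tail=-1: with a label selector, kubectl returns only the last 10 lines per pod by default)
kubectl logs -l app=your-app --since=1h --tail=-1 > incident-evidence/app-logs.txt 2>/dev/null || true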
tar -czf "incident-$(date +%Y%m%d-%H%M%S).tar.gz" incident-evidence/
Postmortem Template
## Incident: Perplexity [Error Type]
**Date:** YYYY-MM-DD | **Duration:** Xh Ym | **Severity:** P[1-4]
### Summary
[1-2 sentences]
### Timeline
- HH:MM — Alert fired: [description]
- HH:MM — Triage: [findings]
- HH:MM — Mitigation: [action taken]
- HH:MM — Resolved
### Root Cause
[Technical explanation — API outage / rate limit / auth / our bug]
### Action Items
- [ ] [Fix] — Owner — Due
Error Handling
| Issue | Cause | Solution |
|-------|-------|----------|
| All models failing | Perplexity outage | Serve cached results, notify users |
| Intermittent 500s | Transient API issue | Retry with backoff |
| Latency spike | Complex searches | Timeout + fallback to sonar |
| No citations | Search degradation | Log and monitor, usually resolves |
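For the "retry with backoff" row, a minimal sketch is shown below. It assumes thrown errors carry a numeric `status` property, as in the fallback example; `retryWithBackoff` and its defaults (4 attempts, 500 ms base delay) are illustrative choices, not values from the Perplexity documentation.

```typescript
// Retries `call` on 429 and 5xx errors using exponential backoff with full jitter.
// Assumption: thrown errors expose a numeric `status` (HTTP status code).
async function retryWithBackoff<T>(
  call: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      const status = (err as { status?: unknown } | null)?.status;
      const retryable =
        typeof status === "number" && (status === 429 || status >= 500);
      // Stop on non-retryable errors (e.g. 401) or when attempts are exhausted.
      if (!retryable || attempt === maxAttempts - 1) break;
      // Full jitter: random delay in [0, baseDelayMs * 2^attempt).
      await sleep(Math.random() * baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

The `sleep` parameter exists so tests can replace the real timer. Jitter spreads retries from many clients over time so they do not hit the API in lockstep once it recovers.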
Output
- Issue triaged and categorized
- Remediation applied (fallback/queue/key rotation)
- Stakeholders notified
- Evidence collected for postmortem
Next Steps
For data handling, see perplexity-data-handling.