Error Handling
An error without context is useless. Every error should answer five questions: what, where, who, when, and why.
Required Attributes
| Attribute | Example | Required |
|-----------|---------|----------|
| error.type | ValidationError, TimeoutError | Yes |
| error.message | "Invalid email format" | Yes |
| exception.stacktrace | Full stack trace | Yes |
| job.name | checkout, signup | Recommended |
| job.step | payment, validation | Recommended |
Error Classification
| Level | Status | Action |
|-------|--------|--------|
| Critical | 500 | Page immediately |
| Error | 500 | Alert, investigate |
| Warning | 4xx | Track, batch review |
| Info | - | Log only |
Structured Error Type
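type AppError struct {
    Type      string         // Error category, e.g. "validation", "timeout"
    Message   string         // Human-readable description
    Code      string         // Stable code used for grouping
    Retryable bool           // Whether the operation can be retried
    Context   map[string]any // Additional context (key and value types are an assumption)
}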
Capture Pattern
On error:
→ Record error on span with error.type, job.name, job.step
→ Set span status to Error
→ Log structured error with trace_id
→ Increment error counter by type/job/step
Fingerprinting
Group by: error type + message (without dynamic data) + top stack frames
Strip from fingerprints: user IDs, request IDs, timestamps, tokens
Anti-Patterns
- Swallowing errors → Log before returning default
- Missing context → Add job.name, job.step, retry.count
- PII in messages → Use structured attributes, not interpolation
- No retry context → Include attempt count and will_retry
References
- references/methodology/jtbd-for-backend.md
- references/anti-patterns.md