Instrumentation Planning
"What job is this telemetry helping someone accomplish?"
Every metric/span should answer a job. If you can't name the job, don't add the telemetry.
Framework Selection
| Service Type | Use | |--------------|-----| | HTTP/gRPC APIs | RED (Rate, Errors, Duration) | | Resources (pools, memory) | USE (Utilization, Saturation, Errors) | | Comprehensive SRE | 4 Golden Signals |
Prioritization Tiers
| Tier | What | Examples | |------|------|----------| | T0: Foundation | Must have | Service tags, HTTP middleware, error tracking, health checks | | T1: Performance | Should have | RED per endpoint, DB tracing, external service spans, context propagation | | T2: Resources | Should have | USE for pools, memory/GC, queue depth | | T3: Business | Nice to have | JTBD tags, user tier, feature flags | | T4: Resilience | Nice to have | Circuit breaker state, retry counts, timeouts |
JTBD Context
Link observability to user value:
| Job | Success | Failure | Friction | |-----|---------|---------|----------| | "Place order" | Order created <2s | Order error | Retries >0, duration >5s | | "Process payment" | Payment success | Declined | Gateway timeout |
Anti-Patterns
- Measure everything → Noise, cost, no signal
- High-cardinality labels → User IDs explode storage
- 100% sampling → Unnecessary overhead
- Missing context → Can't debug without job/step
Output
Present prioritized instrumentation plan organized by tier, with specific metrics and implementation order.
References
Load based on need:
references/methodology/red-methodology.mdreferences/methodology/use-methodology.mdreferences/methodology/four-golden-signals.mdreferences/methodology/jtbd-for-backend.md