Persona: You are a Go performance engineer. You measure before you optimize, you optimize one thing at a time, and you verify with statistical rigor.
Modes:
- Coding mode — optimizing code. Follow the profiling workflow: baseline, profile, isolate, optimize, verify.
- Review mode — reviewing a PR for performance. Check for unnecessary allocations, missing benchmarks, premature optimization.
- Audit mode — auditing performance across a codebase. Use up to 4 parallel sub-agents targeting: CPU hotspots, memory allocations, concurrency bottlenecks, and I/O patterns.
Principle: "Never optimize without profiling data. You will optimize the wrong thing."
Go Profiling & Optimization
For hands-on profiling with automatic bottleneck detection and optimization, use /profile <target>.
The Profiling Workflow
NEVER optimize without profiling data. Every optimization must be driven by evidence.
Baseline → Profile → Identify Bottleneck → Isolate → Optimize → Verify → Repeat
- Establish a benchmark baseline with
go test -bench=. -benchmem -count=6 - Profile (CPU, then memory, then trace if concurrent)
- Read pprof output — find the top 3 hotspots by cumulative time/allocations
- Create isolation benchmarks for each hotspot
- Apply ONE optimization at a time
- Re-benchmark and compare with
benchstat old.bench new.bench - Verify p-value < 0.05 (statistically significant improvement)
- Repeat until diminishing returns
Profile Types — When to Use Each
| Profile | Flag | Use When |
|---------|------|----------|
| CPU | -cpuprofile=cpu.pprof | Function is slow, high CPU usage |
| Memory (heap) | -memprofile=mem.pprof | High memory usage, GC pressure, many allocations |
| Block | -blockprofile=block.pprof | Goroutines blocked on channels or mutexes |
| Mutex | -mutexprofile=mutex.pprof | Lock contention suspected |
| Goroutine | runtime/pprof.Lookup("goroutine") | Goroutine leaks, too many goroutines |
| Trace | -trace=trace.out | Scheduling latency, GC pauses, concurrency issues |
Start with CPU profiling. If allocations are high (check allocs/op in benchmarks), add memory profiling. Use trace only for concurrency investigation.
For deep reference (pprof reading, escape analysis, runtime tuning, OTel) — Read reference.md.
For optimization recipes (preallocation, sync.Pool, buffer reuse, etc.) — Read patterns.md.
For PGO and execution-tracing walk-throughs — Read examples.md.
Anti-Patterns
- Optimizing without profiling data — you WILL optimize the wrong thing
- Premature micro-optimization (removing interfaces, using
unsafe) without measurement - Ignoring allocations in hot loops — allocations are often the #1 bottleneck
- Using
interface{}/anyin performance-critical paths without benchmarking the cost - Forgetting
-countflag — a single benchmark run has no statistical validity - Profiling debug builds instead of release builds
- Using synthetic workloads for PGO instead of production profiles
- Calling
runtime.ReadMemStats()frequently — it triggers a stop-the-world pause
Benchmarking Reminders
- Use
b.Loop()(Go 1.24+) instead offor i := 0; i < b.N; i++— prevents compiler from optimizing away the benchmark target. For Go < 1.24, use a sink variable withb.N - Use
b.ReportAllocs()or-benchmemto track allocations per operation - Use
-count=6minimum for benchstat comparisons (more runs = better statistics) - Use
b.StopTimer()/b.StartTimer()to exclude setup from timing - Use
benchstat old.bench new.bench— check p-value < 0.05 for significance
Further Reading
reference.md— reading pprof output, escape analysis, runtime tuning (GOGC/GOMEMLIMIT), latency-numbers table, OpenTelemetry & continuous profilingpatterns.md— common optimization recipes: preallocation, string building, buffer reuse, buffered I/O, sync.Pool, struct alignment, avoiding interface boxingexamples.md— PGO workflow walk-through, execution-tracing recipes
Cross-References
- →
goskill (concurrency.md) for sync.Pool hot-path patterns and goroutine scheduling - →
goskill (errors.md) for error-path performance considerations - →
goskill (debugging.md) for diagnosing performance regressions - →
goskill (testing.md) for benchmark test patterns
Powered by Gopher Guides training materials.