Mimir + Prometheus Troubleshooting & Query-Builder Skill Skill

Mimir + Prometheus Troubleshooting & Query-Builder Skill

What this Skill does

Use this skill whenever a user needs help with:

PromQL queries
Metric debugging
Missing data / gaps
Cardinality optimization
Aggregation strategy
Recording rules

Best Practices

Low-cardinality label selection

Use labels such as:

job, instance, service, cluster, namespace, env

Avoid:

user_id, session_id, request_id, raw UUIDs

Always narrow time ranges

Prefer "5m", "15m", "1h".

Use correct aggregations

rate() for counters
sum by (...) for grouping
histogram_quantile() for latency

Suggest recording rules if query is heavy

Example Queries

| User Request | PromQL | |------------------------------------|---------------------------------------------------------------------------------------------------------------| | "Error rate for payments in prod" | sum by (job) (rate(http_requests_total{job="payments", env="prod", status=~"5.."}[5m])) | | "Latency p95 for frontend" | histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket{app="frontend"}[5m]))) |

When to Suggest Loki or Tempo

For:

request IDs
root-cause event-level debugging
full request paths

→ Recommend Tempo + Loki correlations.

Limitations

Skill does not run PromQL

Agent Skills: Mimir + Prometheus Troubleshooting & Query-Builder Skill

Install this agent skill to your local

Skill Files