Loki Logs
Query the production logs from Loki, the live log store. Loki itself is not
publicly reachable; the script asks Grafana (grafana.letsrevel.io) to proxy
LogQL to its Loki datasource using a service-account token.
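To make the proxying concrete, here is a minimal sketch of how such a request could be assembled. It assumes Grafana's datasource proxy route (`/api/datasources/proxy/uid/<uid>/...`) in front of Loki's `query_range` endpoint. The `DATASOURCE_UID` value, the `https` scheme, and the `build_query_request` helper are hypothetical and are not taken from `scripts/loki_logs.py`.

```python
from urllib.parse import urlencode
from urllib.request import Request

GRAFANA_URL = "https://grafana.letsrevel.io"
DATASOURCE_UID = "loki"  # hypothetical placeholder; the real UID is defined in Grafana


def build_query_request(logql, token, start_ns, end_ns, limit=100, direction="backward"):
    """Build (but do not send) a Loki query_range request routed through Grafana."""
    params = urlencode({
        "query": logql,          # the LogQL expression
        "start": start_ns,       # window start, Unix epoch in nanoseconds
        "end": end_ns,           # window end, Unix epoch in nanoseconds
        "limit": limit,          # maximum number of lines returned
        "direction": direction,  # "backward" = newest first, "forward" = oldest first
    })
    url = (
        f"{GRAFANA_URL}/api/datasources/proxy/uid/{DATASOURCE_UID}"
        f"/loki/api/v1/query_range?{params}"
    )
    # The service-account token travels as a bearer token.
    return Request(url, headers={"Authorization": f"Bearer {token}"})
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the sketch stops short of that so it can be read without network access.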
Tool
.venv/bin/python scripts/loki_logs.py [SERVICE] [filters] [--since DUR] [-n LIMIT]
SERVICE is one of web, celery_default, beat, telegram (omit to search all).
Default window is the last 1h, newest first, 100 lines. Run --help for the full flag list.
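As an illustration of what a --since window means on the Loki side, the sketch below turns a duration such as 1h or 1d into a start/end pair in nanoseconds. The accepted duration syntax (a number followed by s, m, h, or d) is an assumption, and `parse_duration` and `window_ns` are hypothetical names; check --help for what the script really accepts.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed unit suffixes; the real script may accept more or fewer.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    """Parse strings like '90m', '6h' or '1d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def _to_ns(moment):
    # Whole seconds are precise enough for a query window.
    return int(moment.timestamp()) * 1_000_000_000


def window_ns(since="1h", now=None):
    """Return (start, end) as Unix epoch nanoseconds, ending at `now`."""
    end = now or datetime.now(timezone.utc)
    start = end - parse_duration(since)
    return _to_ns(start), _to_ns(end)
```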
Use the venv interpreter — the script imports python-decouple.
Auth: reads GRAFANA_TOKEN via decouple (from the project .env or the environment).
If it errors with "GRAFANA_TOKEN is not set", it isn't configured yet — tell the user; do not guess.
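The lookup order can be pictured with the stdlib-only approximation below: environment first, then the .env file, which is the precedence python-decouple documents. This is not the script's code (the script uses decouple itself); `read_token` and its simplified .env parsing are hypothetical.

```python
import os


def read_token(environ=None, dotenv_text=""):
    """Return GRAFANA_TOKEN from the environment, else from .env content."""
    environ = os.environ if environ is None else environ
    token = environ.get("GRAFANA_TOKEN", "")
    if not token:
        # Simplified .env parsing: KEY=VALUE lines, '#' comments skipped.
        for line in dotenv_text.splitlines():
            line = line.strip()
            if line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            if key.strip() == "GRAFANA_TOKEN":
                token = value.strip().strip("'\"")
                break
    if not token:
        # Same message the doc says to watch for; report it, do not guess a token.
        raise RuntimeError("GRAFANA_TOKEN is not set")
    return token
```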
What you can filter on
| Kind | Flags | LogQL |
|------|-------|-------|
| Stream labels | SERVICE, --level error (repeatable), --env production | indexed, cheap |
| Line content | --grep TEXT, --exclude TEXT, --regex RE (all repeatable) | substring / regex |
| Request metadata | --trace-id, --request-id, --user-id, --method, --path, --status-code, --ip, --user-agent | structlog fields |
| Raw escape hatch | --query '{service_name="web"} \| status_code="500"' | any LogQL |
Logs are structlog JSON; the displayed line is the event message, with a
metadata line beneath it (suppress with --no-meta, get raw JSON with --json).
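The filter kinds in the table above map onto three LogQL constructs: stream matchers inside the braces, line filters (`|=`, `!=`, `|~`), and field filters after a pipe. The sketch below shows one plausible translation. Only `service_name` and `status_code` are label names confirmed by this document; `level` as a label name and the `build_logql` helper are assumptions. Use --print-query to see what the script really generates.

```python
import json


def _quote(value):
    # LogQL string literals are double-quoted with backslash escapes,
    # which JSON string quoting also produces.
    return json.dumps(value, ensure_ascii=False)


def build_logql(service=None, levels=(), grep=(), exclude=(), regex=(), fields=None):
    """Assemble a LogQL log query from the filter kinds in the table."""
    matchers = []
    if service:
        matchers.append(f"service_name={_quote(service)}")
    if len(levels) == 1:
        matchers.append(f"level={_quote(levels[0])}")
    elif levels:
        # Several levels collapse into one regex matcher.
        matchers.append(f"level=~{_quote('|'.join(levels))}")
    if not matchers:
        # Loki requires at least one matcher; this catch-all selects every stream.
        matchers.append('service_name=~".+"')
    query = "{" + ", ".join(matchers) + "}"
    for text in grep:
        query += f" |= {_quote(text)}"     # line contains text
    for text in exclude:
        query += f" != {_quote(text)}"     # line does not contain text
    for pattern in regex:
        query += f" |~ {_quote(pattern)}"  # line matches regex
    for name, value in (fields or {}).items():
        query += f" | {name}={_quote(value)}"  # metadata field equals value
    return query
```

For example, `build_logql("web", fields={"status_code": "500"})` yields the same expression as the raw escape-hatch row in the table.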
Recipes
# Errors in the API in the last hour
.venv/bin/python scripts/loki_logs.py web --level error
# Follow one request end-to-end across every service
.venv/bin/python scripts/loki_logs.py --request-id <uuid> --since 6h --forward
# Celery failures mentioning a traceback, last day
.venv/bin/python scripts/loki_logs.py celery_default --since 1d --grep Traceback
# HTTP 500 responses, last 2h
.venv/bin/python scripts/loki_logs.py web --status-code 500 --since 2h
# Everything a user triggered
.venv/bin/python scripts/loki_logs.py --user-id <uuid> --since 12h
# Everything from one client IP (e.g. chasing a scraper/429 burst)
.venv/bin/python scripts/loki_logs.py web --ip <ip> --since 6h
# Discover what labels/values exist
.venv/bin/python scripts/loki_logs.py --labels
.venv/bin/python scripts/loki_logs.py --label-values service_name
Workflow when debugging from a symptom
- Start broad: web --level error --since <window> to find the failure.
- Grab the trace_id/request_id from the metadata line.
- Re-query by that id with --forward to read the full request in order, spanning web → celery_default if work was dispatched.
- Use --print-query to inspect/borrow the LogQL; use --json when you need exact timestamps or fields the pretty output omits.
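The hand-off between the first and third steps can be sketched as a pair of small helpers. It assumes that --json prints one JSON object per line and that each object exposes `request_id` (or `trace_id`) as a top-level key; neither detail is confirmed here, and `first_id` and `follow_up_command` are hypothetical names.

```python
import json

SCRIPT = [".venv/bin/python", "scripts/loki_logs.py"]


def first_id(json_lines, key="request_id"):
    """Return the first non-empty `key` found in JSON-per-line output, or None."""
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON record
        if isinstance(record, dict) and record.get(key):
            return str(record[key])
    return None


def follow_up_command(request_id, since="6h"):
    """Build the argv that replays one request in chronological order."""
    return SCRIPT + ["--request-id", request_id, "--since", since, "--forward"]
```

The resulting list can be handed to `subprocess.run`; building it as a list rather than a shell string avoids quoting problems with the id.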
Notes
- Retention is 30 days; queries older than that return nothing.
- A query needs at least one matcher; with no SERVICE/--level the script defaults to {service_name=~".+"} (all app streams).
- Hitting --limit prints a stderr hint; narrow --since or raise -n.
- --ip/--user-agent only match lines ingested after the Alloy config that promotes ip_address/user_agent; older lines won't match those filters.
- Read-only. The token is Viewer-scoped; this cannot modify anything.