Agent Skills: Structured Logging

Guide for writing effective log messages using wide events / canonical log lines. Use when writing logging code, adding instrumentation, improving observability, or reviewing log statements. Teaches high-cardinality, high-dimensionality structured logging that enables debugging.

UncategorizedID: Chemiseblanc/ai/structured-logging

Install this agent skill to your local

pnpm dlx add-skill https://github.com/Chemiseblanc/ai/tree/HEAD/plugins/general/skills/structured-logging

Skill Files

Browse the full folder contents for structured-logging.

Download Skill

Loading file tree…

plugins/general/skills/structured-logging/SKILL.md

Skill Metadata

Name
structured-logging
Description
Guide for writing effective log messages using wide events / canonical log lines. Use when writing logging code, adding instrumentation, improving observability, or reviewing log statements. Teaches high-cardinality, high-dimensionality structured logging that enables debugging.

Structured Logging

Core Principle

Emit one wide event per request per service, not scattered log statements.

  • Build context throughout the request lifecycle
  • Emit once at the end (in a finally block)
  • Optimize for querying, not writing
  • Include high-cardinality fields (user_id, request_id) that enable debugging
  • Include high-dimensionality (50+ fields) capturing full context

Anti-patterns

Avoid these common mistakes:

// String concatenation - loses structure
log("User " + userId + " payment failed: " + error)

// Scattered statements - 17 lines for one request
log("Starting request")
log("Validating token")
log("Token valid")
log("Fetching user")
log("User found")
log("Processing payment")
log("Payment failed")

// Low cardinality only - can't debug specific users
log({level: "error", message: "Payment failed"})

// Missing business context - no debugging power
log({user_id: "123", error: "failed"})

Wide Event Structure

One comprehensive event per request:

{
  "timestamp": "2025-01-15T10:23:45.612Z",
  "level": "error",

  "request_id": "req_8bf7ec2d",
  "trace_id": "abc123",
  "span_id": "span_456",

  "service": "checkout-service",
  "version": "2.4.1",
  "deployment_id": "deploy_789",
  "region": "us-east-1",

  "method": "POST",
  "path": "/api/checkout",
  "status_code": 500,
  "duration_ms": 1247,

  "user": {
    "id": "user_456",
    "subscription": "premium",
    "account_age_days": 847,
    "lifetime_value_cents": 284700
  },

  "cart": {
    "id": "cart_xyz",
    "item_count": 3,
    "total_cents": 15999,
    "coupon_applied": "SAVE20"
  },

  "payment": {
    "method": "card",
    "provider": "stripe",
    "latency_ms": 1089,
    "attempt": 3
  },

  "error": {
    "type": "PaymentError",
    "code": "card_declined",
    "message": "Card declined by issuer",
    "retriable": false,
    "stripe_decline_code": "insufficient_funds"
  },

  "feature_flags": {
    "new_checkout_flow": true,
    "express_payment": false
  },

  "timing": {
    "db_query_ms": 45,
    "external_api_ms": 1089,
    "db_queries_count": 3,
    "cache_hit": false
  }
}

For the comprehensive field reference, see references/wide-event-fields.md.

Log Levels

Use log levels to indicate severity and required action:

| Level | When to Use | Examples | |-------|-------------|----------| | ERROR | Request failed, needs investigation | Unhandled exception, 5xx response, data corruption | | WARN | Degraded but recovered | Retry succeeded, fallback used, rate limit approaching, deprecated API called | | INFO | Normal request completion | The canonical log line for successful requests | | DEBUG | Detailed diagnostics | Cache lookups, query plans, intermediate state (usually sampled out in production) |

The wide event's level field reflects the worst outcome of the request:

  • Request succeeded normally: info
  • Request succeeded after retry/fallback: warn
  • Request failed: error

Examples

Bad: Scattered Logs

log.info("Incoming request", {path: "/checkout"})
log.debug("Fetching user")
log.debug("User found", {user_id: "123"})
log.debug("Fetching cart")
log.info("Processing payment")
log.warn("Payment attempt 1 failed, retrying")
log.warn("Payment attempt 2 failed, retrying")
log.error("Payment failed after 3 attempts")

Good: Single Wide Event

log.error({
  request_id: "req_abc",
  path: "/checkout",
  method: "POST",
  status_code: 500,
  duration_ms: 3200,
  user: {id: "123", subscription: "premium"},
  cart: {item_count: 3, total_cents: 15999},
  payment: {
    provider: "stripe",
    attempts: 3,
    latency_ms: 2800
  },
  error: {
    type: "PaymentError",
    code: "card_declined",
    retriable: false
  }
})

Bad: Missing Context

log.error("Database query failed", {error: "timeout"})

Good: Full Context

log.error({
  request_id: "req_xyz",
  user: {id: "user_789", subscription: "enterprise"},
  operation: "fetch_orders",
  database: {
    host: "db-prod-01",
    query_duration_ms: 30000,
    timeout_ms: 30000,
    table: "orders",
    rows_scanned: 1500000
  },
  error: {
    type: "DatabaseError",
    code: "QUERY_TIMEOUT",
    message: "Query exceeded 30s timeout",
    retriable: true,
    suggestion: "Add index on orders.user_id or paginate results"
  }
})

Implementation Pattern

Build the event throughout the request, emit once at the end:

function loggingMiddleware(handler):
    return function(request):
        start_time = now()

        // Initialize wide event
        event = {
            request_id: request.id,
            trace_id: request.trace_id,
            timestamp: start_time,
            method: request.method,
            path: request.path,
            service: SERVICE_NAME,
            version: SERVICE_VERSION
        }

        // Make event available to handlers
        request.wide_event = event

        try:
            response = handler(request)
            event.status_code = response.status
            event.level = "info"
            return response

        catch error:
            event.status_code = 500
            event.level = "error"
            event.error = {
                type: error.name,
                message: error.message,
                code: error.code,
                retriable: error.retriable ?? false
            }
            throw error

        finally:
            event.duration_ms = now() - start_time
            logger.log(event)

In handlers, enrich with business context:

function checkoutHandler(request):
    event = request.wide_event

    // Add user context
    user = getUser(request)
    event.user = {
        id: user.id,
        subscription: user.tier,
        account_age_days: daysSince(user.created_at),
        lifetime_value_cents: user.ltv
    }

    // Add business context
    cart = getCart(user.id)
    event.cart = {
        id: cart.id,
        item_count: cart.items.length,
        total_cents: cart.total
    }

    // Track timing for external calls
    payment_start = now()
    result = processPayment(cart)
    event.payment = {
        provider: "stripe",
        latency_ms: now() - payment_start,
        attempt: result.attempt_number
    }

    return response(result)

Querying Power

Wide events enable queries that scattered logs cannot:

-- Find all checkout failures for premium users with new checkout flow
SELECT * FROM logs
WHERE path = '/checkout'
  AND status_code >= 500
  AND user.subscription = 'premium'
  AND feature_flags.new_checkout_flow = true
  AND timestamp > now() - interval '1 hour'

-- Analyze payment failures by error code and user tier
SELECT
  error.code,
  user.subscription,
  count(*) as failures,
  avg(payment.latency_ms) as avg_latency
FROM logs
WHERE error.type = 'PaymentError'
GROUP BY error.code, user.subscription

-- Find slow requests for enterprise customers
SELECT * FROM logs
WHERE duration_ms > 2000
  AND user.subscription = 'enterprise'
ORDER BY duration_ms DESC