Agent Skills: Databricks SQL performance review

Review and improve Databricks SQL queries for correctness, readability, and performance (joins, filters, aggregations, partition pruning). Use when someone pastes a SQL query, asks why it is slow, or requests a rewrite/optimization in Databricks SQL.

UncategorizedID: hubert-dudek/medium/sql-performance-review

Install this agent skill to your local

pnpm dlx add-skill https://github.com/hubert-dudek/medium/tree/HEAD/news/202602/skills/sql-performance-review

Skill Files

Browse the full folder contents for sql-performance-review.

Download Skill

Loading file tree…

news/202602/skills/sql-performance-review/SKILL.md

Skill Metadata

Name
sql-performance-review
Description
Review and improve Databricks SQL queries for correctness, readability, and performance (joins, filters, aggregations, partition pruning). Use when someone pastes a SQL query, asks why it is slow, or requests a rewrite/optimization in Databricks SQL.

Databricks SQL performance review

Use this skill when optimizing or reviewing SQL in Databricks SQL.

What to ask for (only if missing)

Ask up to 3 questions total:

  1. The query text (if not provided)
  2. The table(s) involved + their sizes (rough order of magnitude) OR the query profile / execution plan
  3. The desired result constraints (correctness, exactness, latency SLA)

If the user can’t provide sizes/plan, proceed with best-effort heuristics and call out assumptions.

Output format

Use the structure in assets/sql-review-output.md.

Checklist

Use references/sql-checklist.md to ensure you cover the common performance levers:

  • predicate pushdown / partition pruning
  • join strategy and join keys
  • avoid SELECT *
  • minimize shuffles / wide aggregations
  • use correct data types and avoid implicit casts
  • reduce data scanned (pre-filter, semi-joins, EXISTS)

Examples

User: “This query is slow in Databricks SQL. Can you optimize it?” (pastes query)
Assistant: Provide issues, suggestions, and a rewritten query, plus next steps (EXPLAIN, add ZORDER, etc.).

Edge cases

  • If the query is logically wrong (duplicates from joins, missing filters), fix correctness first.
  • If tables are Delta: suggest partitioning/ZORDER/OPTIMIZE only if it matches query patterns.
  • If the user is in a governed environment: avoid suggestions that require elevated permissions unless noted.