Agent Skills: 🐘 PostgreSQL Optimization

Unconventional PostgreSQL optimization techniques

ID: simhacker/moollm/postgres-optimization

Install this agent skill locally:

pnpm dlx add-skill https://github.com/SimHacker/moollm/tree/HEAD/skills/postgres-optimization

Skill Files

Browse the full folder contents for postgres-optimization.


skills/postgres-optimization/SKILL.md

Skill Metadata

Name
postgres-optimization
Description
"Table to analyze"

🐘 PostgreSQL Optimization

"Beyond 'just add an index' β€” creative solutions for real performance problems."

Unconventional optimization techniques for PostgreSQL that go beyond standard DBA playbooks.

Purpose

When conventional approaches (query rewrites, adding indexes, VACUUM, ANALYZE) fall short, these techniques offer creative solutions:

  • Eliminate impossible query scans with constraint exclusion
  • Reduce index size with function-based indexes
  • Enforce uniqueness with hash indexes instead of B-Trees

When to Use

  • Ad-hoc query environments where users make mistakes
  • Large indexes approaching table size
  • Uniqueness constraints on large text values (URLs, documents)
  • Timestamp columns queried at coarser granularity

Technique 1: Constraint Exclusion

The Problem

Check constraints prevent invalid data, but PostgreSQL doesn't use them to optimize queries by default.

CREATE TABLE users (
    id INT PRIMARY KEY,
    username TEXT NOT NULL,
    plan TEXT NOT NULL,
    CONSTRAINT plan_check CHECK (plan IN ('free', 'pro'))
);

An analyst writes:

SELECT * FROM users WHERE plan = 'Pro';  -- Note: capital P

Despite the check constraint making this condition impossible, PostgreSQL scans the entire table.
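You can see this directly with EXPLAIN. Under the default setting the planner ignores the CHECK constraint; the plan shape below is typical, though costs and row estimates depend on your data:

```sql
-- With the default constraint_exclusion = 'partition', the planner
-- falls back to a full sequential scan even for an impossible predicate:
EXPLAIN SELECT * FROM users WHERE plan = 'Pro';
--  Seq Scan on users  (cost=... rows=... width=...)
--    Filter: (plan = 'Pro'::text)
```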

The Solution

SET constraint_exclusion TO 'on';

With constraint exclusion enabled:

EXPLAIN ANALYZE SELECT * FROM users WHERE plan = 'Pro';
Result  (cost=0.00..0.00 rows=0 width=0)
  One-Time Filter: false
Execution Time: 0.008 ms

PostgreSQL recognizes the condition contradicts the constraint and skips the scan entirely.

When to Enable

| Environment | Recommendation |
|-------------|----------------|
| OLTP production | Leave as 'partition' (default) |
| BI / Data Warehouse | Set to 'on' |
| Ad-hoc query tools | Set to 'on' |
| Reporting databases | Set to 'on' |
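Rather than setting this per session, you can scope it to wherever ad-hoc queries actually run. A sketch (the database and role names are illustrative):

```sql
-- Session scope: affects only the current connection
SET constraint_exclusion TO 'on';

-- Database scope: e.g. a dedicated reporting database
ALTER DATABASE reporting SET constraint_exclusion TO 'on';

-- Role scope: e.g. the role your BI tool connects as
ALTER ROLE bi_reader SET constraint_exclusion TO 'on';
```

This keeps the extra planning overhead out of OLTP production while analysts get the benefit automatically.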

Tradeoffs

  • Benefit: Eliminates impossible query scans
  • Cost: Extra planning overhead evaluating constraints against conditions
  • Default: 'partition', which applies constraint exclusion only for partition pruning

Technique 2: Function-Based Indexes for Lower Cardinality

The Problem

You have a sales table with timestamps:

CREATE TABLE sale (
    id INT PRIMARY KEY,
    sold_at TIMESTAMPTZ NOT NULL,
    charged INT NOT NULL
);

Analysts query by day:

SELECT date_trunc('day', sold_at AT TIME ZONE 'UTC'), SUM(charged)
FROM sale
WHERE sold_at BETWEEN '2025-01-01 UTC' AND '2025-02-01 UTC'
GROUP BY 1;

You add a B-Tree index on sold_at: 214 MB for a 160 MB table. The index is larger than the table itself!

The Solution

Index only what queries need:

CREATE INDEX sale_sold_at_date_ix 
ON sale((date_trunc('day', sold_at AT TIME ZONE 'UTC'))::date);

| Index | Size |
|-------|------|
| sale_sold_at_ix (full timestamp) | 214 MB |
| sale_sold_at_date_ix (date only) | 66 MB |

The function-based index is 3x smaller because:

  • Dates are 4 bytes vs 8 bytes for timestamptz
  • Fewer distinct values enable deduplication
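To confirm the planner actually picks up the expression index, EXPLAIN a matching query after refreshing statistics. The plan shape is indicative; details vary with data volume and statistics:

```sql
-- Refresh statistics so the planner knows about the new index
ANALYZE sale;

EXPLAIN
SELECT date_trunc('day', sold_at AT TIME ZONE 'UTC')::date, SUM(charged)
FROM sale
WHERE date_trunc('day', sold_at AT TIME ZONE 'UTC')::date
      BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY 1;
-- Look for an Index Scan or Bitmap Index Scan using sale_sold_at_date_ix
```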

The Discipline Problem

Function-based indexes require exact expression match:

-- Uses the index ✓
WHERE date_trunc('day', sold_at AT TIME ZONE 'UTC')::date 
      BETWEEN '2025-01-01' AND '2025-01-31'

-- Does NOT use the index ✗
WHERE (sold_at AT TIME ZONE 'UTC')::date 
      BETWEEN '2025-01-01' AND '2025-01-31'

Solution: Virtual Generated Columns (PostgreSQL 18+)

ALTER TABLE sale ADD sold_at_date DATE
GENERATED ALWAYS AS ((date_trunc('day', sold_at AT TIME ZONE 'UTC'))::date) VIRTUAL;

Now queries use the virtual column:

SELECT sold_at_date, SUM(charged)
FROM sale
WHERE sold_at_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY 1;

Benefits:

  • Smaller index
  • Faster queries
  • No discipline required: the column guarantees the correct expression
  • No ambiguity about timezones

Limitation: PostgreSQL 18 doesn't support indexes directly on virtual columns (yet).
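Until that lands, a STORED generated column (available since PostgreSQL 12) can be indexed directly, at the cost of materializing the date in every row. A sketch, with an illustrative column name:

```sql
-- STORED materializes the value on disk (extra 4 bytes per row),
-- but unlike a virtual column it can carry an index directly.
ALTER TABLE sale ADD sold_at_date_stored DATE
GENERATED ALWAYS AS ((date_trunc('day', sold_at AT TIME ZONE 'UTC'))::date) STORED;

CREATE INDEX sale_sold_at_date_stored_ix ON sale(sold_at_date_stored);
```

This trades table size for the same small index and the same no-discipline query surface.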


Technique 3: Hash Index for Uniqueness

The Problem

You have a table with large URLs:

CREATE TABLE urls (
    id INT PRIMARY KEY,
    url TEXT NOT NULL,
    data JSON
);

You add a unique B-Tree index:

CREATE UNIQUE INDEX urls_url_unique_ix ON urls(url);

| Object | Size |
|--------|------|
| Table | 160 MB |
| B-Tree index | 154 MB |

The index is almost as large as the table because B-Tree stores actual values in leaf blocks.

The Solution

Use an exclusion constraint with a hash index:

ALTER TABLE urls 
ADD CONSTRAINT urls_url_unique_hash 
EXCLUDE USING HASH (url WITH =);

| Index | Size |
|-------|------|
| B-Tree | 154 MB |
| Hash | 32 MB |

The hash index is 5x smaller because it stores hash values, not the actual URLs.

Uniqueness Is Enforced

INSERT INTO urls (id, url) VALUES (1000002, 'https://example.com');
-- ERROR: conflicting key value violates exclusion constraint

Queries Still Fast

EXPLAIN ANALYZE SELECT * FROM urls WHERE url = 'https://example.com';
Index Scan using urls_url_unique_hash on urls
Execution Time: 0.022 ms  -- Faster than B-Tree's 0.046 ms!

Limitations

| Feature | B-Tree Unique | Hash Exclusion |
|---------|---------------|----------------|
| Foreign key reference | ✓ | ✗ |
| ON CONFLICT (column) | ✓ | ✗ |
| ON CONFLICT ON CONSTRAINT | ✓ | ✓ (DO NOTHING only) |
| ON CONFLICT DO UPDATE | ✓ | ✗ |
| MERGE | ✓ | ✓ |
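As the table indicates, the exclusion constraint can still act as an ON CONFLICT arbiter when named explicitly, but only with DO NOTHING (the id value here is illustrative):

```sql
-- Works: DO NOTHING with the named exclusion constraint as arbiter
INSERT INTO urls (id, url)
VALUES (1000003, 'https://example.com')
ON CONFLICT ON CONSTRAINT urls_url_unique_hash DO NOTHING;

-- DO UPDATE cannot use an exclusion constraint as its arbiter;
-- PostgreSQL rejects it at parse time.
```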

Workaround: Use MERGE

Instead of INSERT ... ON CONFLICT DO UPDATE:

MERGE INTO urls t
USING (VALUES (1000004, 'https://example.com')) AS s(id, url)
ON t.url = s.url
WHEN MATCHED THEN UPDATE SET id = s.id
WHEN NOT MATCHED THEN INSERT (id, url) VALUES (s.id, s.url);

Quick Reference

Diagnostic Queries

Check index sizes:

\di+ table_*

Compare index to table size:

SELECT 
    relname AS name,
    pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class 
WHERE relname LIKE 'your_table%'
ORDER BY pg_relation_size(oid) DESC;
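A variant of the size query that flags only indexes approaching their parent table's size, system-wide; the 50% threshold is a judgment call, not a hard rule:

```sql
-- Indexes larger than half their parent table (illustrative 50% threshold)
SELECT
    i.indexrelid::regclass AS index_name,
    i.indrelid::regclass   AS table_name,
    pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size,
    pg_size_pretty(pg_relation_size(i.indrelid))   AS table_size
FROM pg_index i
WHERE pg_relation_size(i.indexrelid) > pg_relation_size(i.indrelid) / 2
ORDER BY pg_relation_size(i.indexrelid) DESC;
```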

Check constraint_exclusion setting:

SHOW constraint_exclusion;

Decision Tree

Is the query scanning despite a condition a constraint makes impossible?
├── Yes → Enable constraint_exclusion
└── No
    ↓
Is the index nearly as large as the table?
├── Yes, timestamp column → Function-based index on date
├── Yes, large text column → Hash exclusion constraint
└── No → Standard B-Tree is fine

Commands

| Command | Action |
|---------|--------|
| ANALYZE [table] | Analyze query performance |
| CHECK-CONSTRAINTS | Evaluate constraint exclusion opportunity |
| LOWER-CARDINALITY | Find function-based index opportunities |
| HASH-UNIQUE | Evaluate hash index for large values |
| COMPARE-INDEXES | Compare index sizes and performance |


Integration

| Direction | Skill | Relationship |
|-----------|-------|--------------|
| ← | debugging | Query debugging leads here |
| → | plan-then-execute | Systematic optimization |