The core workflow: Identify (via pg_stat_statements or the slow query log) → Analyze (find out why via EXPLAIN (ANALYZE, BUFFERS)) → Fix (add an index, rewrite the query, or update statistics). Roughly 80% of slow queries are caused by missing indexes, stale statistics, or sequential scans on large tables.

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) runs the query and shows the actual execution plan with timing, row counts, and buffer usage. In PostgreSQL 18+, BUFFERS is included automatically with ANALYZE. Use explain.depesz.com to interpret the output visually.

Warning: EXPLAIN ANALYZE on a destructive query (DELETE, UPDATE) actually executes it. Wrap it in BEGIN/ROLLBACK, or use EXPLAIN without ANALYZE.

The core tooling (pg_stat_statements, auto_explain, EXPLAIN BUFFERS) has been available since PostgreSQL 9.2. PG17+ adds incremental vacuum and streaming I/O; PG18+ adds async I/O and skip scans.

Key pitfalls:
- Never run EXPLAIN ANALYZE on UPDATE/DELETE/INSERT without wrapping it in BEGIN/ROLLBACK. This is the #1 data-loss mistake.
- pg_stat_statements requires shared_preload_libraries and a restart — it cannot be loaded at runtime with CREATE EXTENSION alone. On managed databases (RDS, Cloud SQL), it may already be enabled.
- work_mem is per-sort-operation, not per-session — setting work_mem = '256MB' with 10 sort operations can use 2.5GB of RAM. Increase it globally with extreme caution.
- CREATE INDEX CONCURRENTLY cannot run inside a transaction block and may leave an invalid index on failure. Always verify with \di+ or pg_indexes after creation.
- pg_stat_statements and pg_stat_user_indexes counters reset on restart; idx_scan = 0 is only meaningful since the last restart.
- Check pg_stat_activity for "idle in transaction" sessions.

| # | Symptom / Goal | Command / Solution | Notes |
|---|---|---|---|
| 1 | Find the slowest queries | SELECT ... FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 20 | Requires pg_stat_statements extension [src2] |
| 2 | Find currently running long queries | SELECT pid, now()-query_start AS duration, query FROM pg_stat_activity WHERE state='active' ORDER BY duration DESC | Real-time view [src1] |
| 3 | Analyze a query's execution plan | EXPLAIN (ANALYZE, BUFFERS) SELECT ... | Runs the query; shows actual time + buffer hits. PG18+: BUFFERS auto-included [src1, src8] |
| 4 | Analyze without running (safe for DML) | EXPLAIN SELECT ... | Estimated plan only; does not execute [src1] |
| 5 | Enable slow query logging | log_min_duration_statement = 500 | Logs queries > 500ms [src6] |
| 6 | Auto-log slow query plans | Load auto_explain; set auto_explain.log_min_duration = 1000 | No manual EXPLAIN needed [src3] |
| 7 | Check for missing indexes | Look for "Seq Scan" on large tables in EXPLAIN output | Rows × loops is the cost indicator [src1, src5] |
| 8 | Check index usage statistics | SELECT * FROM pg_stat_user_indexes | Shows idx_scan = 0 for unused indexes [src2] |
| 9 | Check statistics freshness | SELECT relname, last_analyze FROM pg_stat_user_tables | Stale stats → bad plans [src1] |
| 10 | Update statistics manually | ANALYZE tablename | Run after bulk inserts/deletes [src1] |
| 11 | Find unused indexes | SELECT * FROM pg_stat_user_indexes WHERE idx_scan = 0 | Unused indexes slow writes [src2] |
| 12 | Find table bloat | SELECT * FROM pg_stat_user_tables ORDER BY n_dead_tup DESC | High dead tuples → run VACUUM [src2] |
| 13 | Create index without locking table | CREATE INDEX CONCURRENTLY idx_name ON table(col) | Safe in production [src1] |
START — Query is slow
├─── Step 1: WHERE is time being spent?
│    ├─── pg_stat_statements → ORDER BY total_exec_time/mean_exec_time DESC [src2]
│    ├─── pg_stat_activity → real-time long-running queries
│    └─── slow query log → log_min_duration_statement [src6]
│
├─── Step 2: EXPLAIN (ANALYZE, BUFFERS) on the slow query
│    ├─── "Seq Scan" on large table?
│    │    ├─── No WHERE clause → expected full scan
│    │    ├─── Low selectivity → use partial index [src1]
│    │    └─── Missing index → CREATE INDEX CONCURRENTLY [src1]
│    ├─── Estimated rows >> actual rows (or vice versa)?
│    │    └─── Stale statistics → ANALYZE tablename [src1]
│    ├─── Nested Loop with large outer row count?
│    │    └─── Add index on join column [src1, src4]
│    ├─── Sort in plan (no index for ORDER BY)?
│    │    └─── Add index on ORDER BY columns [src1]
│    ├─── "Buffers: shared read=N" — high N?
│    │    └─── Increase shared_buffers; check work_mem [src1, src4]
│    └─── Hash Join with high Memory Usage?
│         └─── SET work_mem = '64MB' for the session [src1]
│
└─── Step 3: Fix
     ├─── Add index → CREATE INDEX CONCURRENTLY [src1]
     ├─── Rewrite query → avoid SELECT *, func on indexed col, OFFSET [src4, src5]
     ├─── Update stats → ANALYZE; tune autovacuum [src1]
     └─── Tune memory → work_mem, shared_buffers [src1, src4]
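The Step 2 checks above can be automated against EXPLAIN (ANALYZE, FORMAT JSON) output. A minimal sketch — in real output the JSON root is a list like `[{"Plan": {...}}]`, so pass the inner "Plan" node; the thresholds (100k rows, 10× misestimate) are illustrative assumptions, not planner rules:

```python
# Heuristic red-flag scan over an EXPLAIN (ANALYZE, FORMAT JSON) plan tree.
# Node keys ("Node Type", "Plan Rows", "Actual Rows", "Plans") follow
# PostgreSQL's JSON plan format; the thresholds here are illustrative only.
def find_red_flags(node, flags=None):
    if flags is None:
        flags = []
    ntype = node.get("Node Type", "")
    est, act = node.get("Plan Rows", 0), node.get("Actual Rows", 0)
    # Large sequential scan → candidate for an index
    if ntype == "Seq Scan" and act > 100_000:
        flags.append(f"Seq Scan on {node.get('Relation Name')} ({act} rows) → consider an index")
    # Estimate off by >10x in either direction → statistics are likely stale
    if est and act and max(est, act) / max(min(est, act), 1) > 10:
        flags.append(f"{ntype}: estimated {est} vs actual {act} rows → run ANALYZE")
    # Sort spilled to disk → work_mem too small for this query
    if node.get("Sort Method", "").startswith("external"):
        flags.append(f"{ntype}: sort spilled to disk → raise work_mem")
    for child in node.get("Plans", []):        # recurse into child nodes
        find_red_flags(child, flags)
    return flags

plan = {"Node Type": "Seq Scan", "Relation Name": "orders",
        "Plan Rows": 42, "Actual Rows": 420_000}
print(find_red_flags(plan))
```

Feed it `json.loads(explain_output)[0]["Plan"]` from `EXPLAIN (ANALYZE, FORMAT JSON)` to scan a whole plan tree.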
The most important tool for identifying slow queries across your entire database. [src2]
# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
pg_stat_statements.max = 10000
-- After restart:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Top 20 slowest queries by total time
SELECT
    round(total_exec_time::numeric, 2) AS total_ms,
    round(mean_exec_time::numeric, 2) AS mean_ms,
    calls,
    left(query, 120) AS query_snippet
FROM pg_stat_statements
WHERE calls > 10
ORDER BY total_exec_time DESC
LIMIT 20;
Verify: SELECT count(*) FROM pg_stat_statements; → should return > 0
For real-time capture in logs. [src6]
# postgresql.conf
log_min_duration_statement = 500 # log queries > 500ms
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '
log_checkpoints = on
log_lock_waits = on
log_temp_files = 0
Verify: Run a known slow query, then check pg_log directory for the logged entry.
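With log_min_duration_statement set, each offending statement is logged as `duration: <ms> ms  statement: <sql>`. A rough parser sketch — the sample lines are fabricated but follow the default text log format with the log_line_prefix shown above:

```python
import re

# Matches the "duration: 1234.567 ms  statement: ..." payload PostgreSQL
# emits when log_min_duration_statement is exceeded (text log format).
SLOW_RE = re.compile(r"duration: (?P<ms>[\d.]+) ms\s+(?:statement|execute .*?): (?P<sql>.*)")

def parse_slow_lines(lines, threshold_ms=500.0):
    hits = []
    for line in lines:
        m = SLOW_RE.search(line)
        if m and float(m.group("ms")) >= threshold_ms:
            hits.append((float(m.group("ms")), m.group("sql").strip()))
    return sorted(hits, reverse=True)  # slowest first

log = [
    "2026-02-19 12:00:01 [123]: [1-1] user=app,db=mydb LOG:  duration: 812.44 ms  statement: SELECT * FROM orders WHERE status = 'pending'",
    "2026-02-19 12:00:02 [124]: [1-1] user=app,db=mydb LOG:  duration: 3.10 ms  statement: SELECT 1",
]
print(parse_slow_lines(log))
```

Multi-line statements and CSV/JSON log formats need more handling; for serious log analysis use a dedicated tool such as pgBadger.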
The primary tool for understanding why a specific query is slow. [src1, src5]
-- Safe estimate (no execution)
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- Full diagnostic (RUNS the query!)
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders WHERE customer_id = 42;
-- For DML — ALWAYS wrap in transaction
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
UPDATE orders SET status = 'processed' WHERE created_at < NOW() - INTERVAL '1 year';
ROLLBACK;
Reading the output:
- Seq Scan on a large table → missing index
- actual rows >> estimated rows → stale statistics; run ANALYZE
- Buffers: shared read=N (high N) → data not cached; increase shared_buffers
- Sort Method: external merge Disk → increase work_mem
- Rows Removed by Filter: N → index needed on the filter column

PostgreSQL 18+ note: BUFFERS output is now automatically included when you use ANALYZE, so EXPLAIN (ANALYZE) SELECT ... is sufficient. [src8]
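Buffer counts are in pages, so "shared read=N" translates directly into bytes fetched from disk, and hit vs read gives the cache hit ratio for the query. A quick calc, assuming the default 8192-byte block size:

```python
BLOCK_SIZE = 8192  # default PostgreSQL page size (a compile-time constant)

def buffer_stats(shared_hit, shared_read):
    """Convert EXPLAIN's 'Buffers: shared hit=H read=R' into MB read and hit ratio."""
    read_mb = shared_read * BLOCK_SIZE / 1024 / 1024
    total = shared_hit + shared_read
    hit_ratio = shared_hit / total if total else None
    return read_mb, hit_ratio

# "Buffers: shared hit=2048 read=131072" → 1 GiB read from disk, ~1.5% served from cache
read_mb, ratio = buffer_stats(2048, 131072)
print(f"{read_mb:.0f} MB read, hit ratio {ratio:.1%}")
```

A consistently low hit ratio on hot queries is the signal to revisit shared_buffers (or add an index so less data is touched at all).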
Automatically logs plans for slow queries without manual EXPLAIN. [src3]
# postgresql.conf
shared_preload_libraries = 'pg_stat_statements,auto_explain'
auto_explain.log_min_duration = 1000 # > 1 second
auto_explain.log_analyze = on
auto_explain.log_buffers = on
auto_explain.log_timing = on
auto_explain.log_nested_statements = on
auto_explain.sample_rate = 1.0
Verify: Run a query > 1s, then check PostgreSQL logs for the plan output.
The most common fix. Always use CONCURRENTLY in production. [src1, src4]
-- Single column
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);
-- Composite (equality columns first, range last)
CREATE INDEX CONCURRENTLY idx_orders_status_created ON orders (status, created_at DESC);
-- Partial index (only the subset you query)
CREATE INDEX CONCURRENTLY idx_orders_pending ON orders (created_at) WHERE status = 'pending';
-- Covering index (avoids heap access)
CREATE INDEX CONCURRENTLY idx_orders_covering ON orders (customer_id) INCLUDE (status, total_amount);
-- JSONB
CREATE INDEX CONCURRENTLY idx_events_metadata ON events USING GIN (metadata);
Verify: EXPLAIN ANALYZE should show "Index Scan" or "Index Only Scan" instead of "Seq Scan".
When the planner estimates badly. [src1]
ANALYZE orders; -- update one table
ANALYZE; -- update all tables
-- Check freshness
SELECT relname, last_analyze, last_autoanalyze, n_live_tup, n_dead_tup
FROM pg_stat_user_tables ORDER BY last_analyze ASC NULLS FIRST;
-- Tune autovacuum for high-churn tables
ALTER TABLE orders SET (
autovacuum_analyze_scale_factor = 0.01,
autovacuum_analyze_threshold = 100
);
Verify: Re-run EXPLAIN ANALYZE — estimated rows should now be close to actual rows.
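The ALTER TABLE settings above control when autoanalyze fires: per the PostgreSQL docs it triggers once the number of changed rows exceeds analyze_threshold + analyze_scale_factor × reltuples. A quick check of what that means at scale (the defaults shown, 0.1 and 50, are the documented PostgreSQL defaults; the row counts are examples):

```python
def autoanalyze_trigger(reltuples, scale_factor=0.1, threshold=50):
    """Rows that must change before autoanalyze runs, per the documented formula."""
    return threshold + scale_factor * reltuples

# Default settings: a 10M-row table waits for ~1M changed rows before re-ANALYZE
print(autoanalyze_trigger(10_000_000))            # → 1000050.0
# Tuned as above (scale_factor 0.01, threshold 100): fires after ~100k changes
print(autoanalyze_trigger(10_000_000, 0.01, 100)) # → 100100.0
```

This is why big tables drift: the default scale factor makes the trigger point grow linearly with table size, so lowering it per-table (as above) keeps statistics fresh on high-churn tables.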
-- Top 10 slowest queries by total time
SELECT
    'Query #' || row_number() OVER (ORDER BY total_exec_time DESC) AS rank,
    round(total_exec_time::numeric, 1) AS total_ms,
    round(mean_exec_time::numeric, 1) AS mean_ms,
    calls,
    left(regexp_replace(query, '\s+', ' ', 'g'), 100) AS query_snippet
FROM pg_stat_statements
WHERE calls > 5
ORDER BY total_exec_time DESC LIMIT 10;
-- Tables with possible missing indexes
SELECT relname, seq_scan, idx_scan,
       CASE WHEN seq_scan + idx_scan > 0
            THEN round(idx_scan::numeric * 100 / (seq_scan + idx_scan), 1)
            ELSE NULL END AS idx_pct,
       n_live_tup AS rows
FROM pg_stat_user_tables
WHERE n_live_tup > 10000 AND seq_scan > idx_scan
ORDER BY seq_scan DESC LIMIT 10;
-- Unused indexes (pg_stat_user_indexes exposes relname/indexrelname,
-- not tablename/indexname)
SELECT relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelname NOT LIKE '%pkey%'
ORDER BY pg_relation_size(indexrelid) DESC LIMIT 10;
-- Tables with stale statistics
SELECT relname, last_analyze::date, n_live_tup AS rows, n_dead_tup AS dead
FROM pg_stat_user_tables
WHERE (last_analyze < NOW() - INTERVAL '7 days' OR last_analyze IS NULL)
  AND n_live_tup > 1000
ORDER BY n_live_tup DESC LIMIT 10;
#!/usr/bin/env python3
"""
Input: PostgreSQL DSN + threshold_ms (default 100ms)
Output: Ranked slow query report with recommendations
Requirements: pip install psycopg2-binary
"""
import sys

import psycopg2


def get_slow_queries(dsn: str, threshold_ms: float = 100.0, limit: int = 20):
    conn = psycopg2.connect(dsn)
    cur = conn.cursor()
    cur.execute("SELECT 1 FROM pg_extension WHERE extname = 'pg_stat_statements'")
    if not cur.fetchone():
        raise RuntimeError("pg_stat_statements not installed — add to shared_preload_libraries")
    cur.execute("""
        SELECT
            calls,
            round(total_exec_time::numeric, 2) AS total_ms,
            round(mean_exec_time::numeric, 2) AS mean_ms,
            round(max_exec_time::numeric, 2) AS max_ms,
            shared_blks_read AS disk_reads,
            shared_blks_hit + shared_blks_read AS total_blocks,
            left(regexp_replace(query, E'\\\\s+', ' ', 'g'), 200) AS query_text
        FROM pg_stat_statements
        WHERE mean_exec_time > %s AND calls > 5
        ORDER BY total_exec_time DESC LIMIT %s
    """, (threshold_ms, limit))
    cols = [d[0] for d in cur.description]
    rows = [dict(zip(cols, r)) for r in cur.fetchall()]
    for r in rows:
        recs = []
        if r['disk_reads'] > r['total_blocks'] * 0.5:
            recs.append('High disk reads → increase shared_buffers or add index')
        if 'SELECT *' in r['query_text'].upper():
            recs.append('Avoid SELECT * — specify only needed columns')
        if 'OFFSET' in r['query_text'].upper():
            recs.append('OFFSET pagination is O(n) → use keyset/cursor pagination')
        r['recommendations'] = recs or ['Run EXPLAIN (ANALYZE, BUFFERS) for details']
    cur.close()
    conn.close()
    return rows


if __name__ == '__main__':
    dsn = sys.argv[1] if len(sys.argv) > 1 else 'postgresql://localhost/mydb'
    threshold = float(sys.argv[2]) if len(sys.argv) > 2 else 100.0
    print(f"\n=== PostgreSQL Slow Query Report (mean > {threshold}ms) ===\n")
    print(f"{'#':<3} {'mean_ms':>8} {'calls':>7} {'total_ms':>10} Query")
    print("-" * 100)
    for i, r in enumerate(get_slow_queries(dsn, threshold), 1):
        print(f"{i:<3} {r['mean_ms']:>8} {r['calls']:>7} {r['total_ms']:>10} {r['query_text'][:70]}")
        for rec in r['recommendations']:
            print(f"      → {rec}")
#!/bin/bash
DB="${PGDATABASE:-mydb}" HOST="${PGHOST:-localhost}"
PORT="${PGPORT:-5432}" USER="${PGUSER:-postgres}"

# Run a query and emit pipe-delimited rows. A function avoids the quoting
# pitfalls of eval'ing a command string that contains SQL literals.
run_sql() {
    psql -h "$HOST" -p "$PORT" -U "$USER" -d "$DB" -t -A -F'|' -c "$1"
}

echo "=== PostgreSQL Index Opportunity Report — $(date) ==="
echo "--- Sequential scan heavy tables ---"
run_sql "
SELECT relname, seq_scan, idx_scan, n_live_tup,
       pg_size_pretty(pg_total_relation_size(relid)) AS size
FROM pg_stat_user_tables
WHERE n_live_tup > 10000 AND seq_scan > idx_scan * 2
ORDER BY seq_scan * n_live_tup DESC LIMIT 10;" | while IFS='|' read -r t seq idx rows size; do
    echo "  $t | rows=$rows size=$size | seq=$seq idx=$idx"
    echo "    ACTION: Identify query on $t → CREATE INDEX CONCURRENTLY"
done

echo "--- Unused indexes ---"
run_sql "
SELECT relname, indexrelname, pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelname NOT LIKE '%pkey%'
ORDER BY pg_relation_size(indexrelid) DESC LIMIT 10;" | while IFS='|' read -r t idx size; do
    echo "  $t.$idx ($size) — consider: DROP INDEX CONCURRENTLY $idx"
done

echo "--- Tables needing ANALYZE ---"
run_sql "
SELECT relname, COALESCE(last_analyze::date::text,'never') AS last, n_live_tup
FROM pg_stat_user_tables
WHERE (last_analyze < NOW()-INTERVAL '7 days' OR last_analyze IS NULL) AND n_live_tup > 1000
ORDER BY n_live_tup DESC LIMIT 5;" | while IFS='|' read -r t last rows; do
    echo "  $t (last=$last rows=$rows) → ANALYZE $t;"
done
-- ❌ BAD — estimates can be wildly wrong [src1, src5]
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
-- Shows "rows=42" — but actual rows may be 420,000
-- Plan looks fine on paper; catastrophically slow in reality
-- ✅ GOOD — actual execution data reveals the truth [src1, src5]
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders WHERE status = 'pending';
-- "actual rows=420000 loops=1" vs "rows=42" → stale statistics!
-- Fix: ANALYZE orders;
-- ❌ BAD — index on a column with few distinct values is ignored [src1, src4]
CREATE INDEX idx_orders_is_active ON orders (is_active);
-- If 95% of rows are is_active = true, the planner skips this index
-- Seq Scan is faster when selectivity is low
-- ✅ GOOD — index only the actionable minority [src1, src4]
CREATE INDEX CONCURRENTLY idx_orders_inactive
ON orders (created_at) WHERE is_active = false;
-- Tiny index; only covers 5% of rows — very fast for inactive queries
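Why the planner skips the low-selectivity index can be seen with a back-of-envelope cost comparison. This sketch uses PostgreSQL's documented default cost constants (seq_page_cost=1.0, random_page_cost=4.0, cpu_tuple_cost=0.01) but is a drastic simplification of the real cost model — it only illustrates the selectivity trade-off:

```python
# Simplified planner-style cost comparison. The constants are PostgreSQL's
# documented defaults; the cost formulas are illustrative, not the real model.
SEQ_PAGE_COST, RANDOM_PAGE_COST, CPU_TUPLE_COST = 1.0, 4.0, 0.01

def seq_scan_cost(pages, rows):
    # Read every page sequentially, process every row
    return pages * SEQ_PAGE_COST + rows * CPU_TUPLE_COST

def index_scan_cost(matching_rows):
    # Worst case: one random heap page fetch per matching row
    return matching_rows * (RANDOM_PAGE_COST + CPU_TUPLE_COST)

pages, rows = 10_000, 1_000_000
for fraction in (0.95, 0.001):
    idx = index_scan_cost(rows * fraction)
    seq = seq_scan_cost(pages, rows)
    winner = "index scan" if idx < seq else "seq scan"
    print(f"{fraction:.1%} of rows match → {winner} (idx={idx:,.0f} vs seq={seq:,.0f})")
```

When most rows match, random heap fetches cost far more than one sequential pass, so the index is ignored; a partial index sidesteps this by existing only for the selective minority.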
-- ❌ BAD — O(n) cost; gets slower with every page [src4, src5]
SELECT * FROM events ORDER BY created_at DESC LIMIT 20 OFFSET 100000;
-- Must scan and discard 100,000 rows to return 20
-- ✅ GOOD — O(log n); same speed regardless of page number [src4, src5]
-- First page:
SELECT * FROM events ORDER BY created_at DESC, id DESC LIMIT 20;
-- Next page (use last row's values as cursor):
SELECT * FROM events
WHERE (created_at, id) < ('2026-02-19 12:00:00', 12345)
ORDER BY created_at DESC, id DESC
LIMIT 20;
-- Requires: CREATE INDEX ON events (created_at DESC, id DESC);
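The cursor bookkeeping for keyset pagination is mechanical and easy to get wrong by hand. A hypothetical helper sketch — table and column names match the example above, and the %(name)s placeholders assume psycopg2-style parameter binding (already used elsewhere in this document):

```python
# Keyset pagination sketch: derive the next-page cursor from the last row
# of the current page. Uses the row-tuple comparison shown above.
FIRST_PAGE = """SELECT * FROM events
ORDER BY created_at DESC, id DESC LIMIT %(limit)s"""

NEXT_PAGE = """SELECT * FROM events
WHERE (created_at, id) < (%(last_created)s, %(last_id)s)
ORDER BY created_at DESC, id DESC LIMIT %(limit)s"""

def next_page_params(page_rows, limit=20):
    """Cursor params from the last row of the current page, or None at the end."""
    if len(page_rows) < limit:
        return None  # short page → no more results
    last = page_rows[-1]
    return {"last_created": last["created_at"], "last_id": last["id"], "limit": limit}

page = [{"id": 12345, "created_at": "2026-02-19 12:00:00"}] * 20
print(next_page_params(page))
```

Run FIRST_PAGE once, then feed each page's `next_page_params` result into NEXT_PAGE until it returns None; the id tiebreaker keeps the cursor stable when created_at values collide.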
- EXPLAIN ANALYZE on UPDATE, DELETE, or INSERT actually modifies data. Always wrap destructive statements in BEGIN/ROLLBACK. [src1]
- Run ANALYZE tablename manually after large data changes. [src1, src4]
- For WHERE status = 'pending' AND created_at > '2026-01-01', the index (created_at, status) is less efficient than (status, created_at). Put equality columns first, range columns last. [src1, src4]
- A function on an indexed column defeats the index: WHERE lower(email) = '[email protected]' cannot use an index on email. Use an expression index: CREATE INDEX ON users (lower(email)). [src1]
- Every index adds write overhead to INSERT, UPDATE, and DELETE. On write-heavy tables, 10+ indexes can make writes 5× slower. Audit with pg_stat_user_indexes WHERE idx_scan = 0. [src2, src4]
- SELECT * prevents index-only scans: selecting all columns forces PostgreSQL to visit the heap for every row. Specify only needed columns, or use covering indexes (INCLUDE) to enable index-only scans. [src1, src5]

-- Find slow queries
SELECT round(total_exec_time::numeric,1) AS total_ms,
round(mean_exec_time::numeric,1) AS mean_ms,
calls, left(query, 100) AS query
FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 20;
-- Currently running > 5 seconds
SELECT pid, now()-query_start AS duration, state, left(query,100)
FROM pg_stat_activity
WHERE state = 'active' AND now()-query_start > interval '5 seconds'
ORDER BY duration DESC;
-- Analyze a query
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) SELECT ...;
-- Missing index candidates
SELECT relname, seq_scan, idx_scan, n_live_tup
FROM pg_stat_user_tables
WHERE seq_scan > idx_scan AND n_live_tup > 10000
ORDER BY seq_scan DESC;
-- Unused indexes
SELECT relname, indexrelname, pg_size_pretty(pg_relation_size(indexrelid))
FROM pg_stat_user_indexes WHERE idx_scan = 0 AND indexrelname NOT LIKE '%pkey%';
-- Table bloat (dead tuples)
SELECT relname, n_live_tup, n_dead_tup,
       round(n_dead_tup::numeric*100/NULLIF(n_live_tup+n_dead_tup,0),1) AS dead_pct
FROM pg_stat_user_tables WHERE n_dead_tup > 10000 ORDER BY n_dead_tup DESC;
-- Statistics freshness
SELECT relname, last_analyze, last_autoanalyze FROM pg_stat_user_tables
WHERE last_analyze < NOW()-INTERVAL '7 days' OR last_analyze IS NULL;
ANALYZE tablename;
VACUUM ANALYZE tablename;
-- Memory settings
SHOW shared_buffers;
SHOW work_mem;
SET work_mem = '64MB';
-- Online tools:
-- explain.depesz.com — visual EXPLAIN output analyzer
-- explain.dalibo.com — alternative EXPLAIN visualizer
-- pganalyze.com — continuous query monitoring
| Feature | Available Since | Notes |
|---|---|---|
| EXPLAIN ANALYZE | PostgreSQL 7.1 | Core diagnostic tool [src1] |
| EXPLAIN BUFFERS | PostgreSQL 9.0 | Shows buffer hits/misses [src1] |
| pg_stat_statements | PostgreSQL 9.2 | Must be in shared_preload_libraries [src2] |
| auto_explain | PostgreSQL 8.4 | Must be in shared_preload_libraries [src3] |
| CREATE INDEX CONCURRENTLY | PostgreSQL 8.2 | Non-locking index creation [src1] |
| Covering indexes (INCLUDE) | PostgreSQL 11 | Avoids heap access for covered columns [src1] |
| EXPLAIN FORMAT JSON/XML/YAML | PostgreSQL 9.0 | Machine-readable plan output [src1] |
| pg_stat_statements.track_planning | PostgreSQL 13 | Track planning time separately [src2] |
| log_min_duration_sample | PostgreSQL 14 | Sample a fraction of long queries [src1] |
| Incremental VACUUM (dirty page bitmap) | PostgreSQL 17 | Overhead proportional to write volume, not table size [src7] |
| Streaming I/O (sequential reads) | PostgreSQL 17 | Read stream API for faster seq scans and ANALYZE [src7] |
| B-tree IN-clause optimization | PostgreSQL 17 | Multi-value lookups significantly faster [src7] |
| VACUUM memory usage reduced 20× | PostgreSQL 17 | No longer limited to 1GB of maintenance_work_mem [src7] |
| EXPLAIN ANALYZE auto-includes BUFFERS | PostgreSQL 18 | No need to specify BUFFERS separately [src8] |
| Async I/O subsystem | PostgreSQL 18 | io_method config; benefits seq scans, bitmap heap scans, vacuum [src8] |
| Skip scan (multi-column B-tree) | PostgreSQL 18 | Indexes usable without restrictions on leading columns [src8] |
| Self-join elimination | PostgreSQL 18 | Planner removes unnecessary self-joins automatically [src8] |
| pg_stat_statements parallel worker tracking | PostgreSQL 18 | parallel_workers_to_launch, parallel_workers_launched [src8] |
| Parallel GIN index builds | PostgreSQL 18 | Faster GIN index creation for JSONB/full-text workloads [src8] |
| pg_upgrade preserves optimizer stats | PostgreSQL 18 | No expensive re-ANALYZE after major version upgrade [src8] |
| Use for | Don't confuse with |
|---|---|
| Slow individual query → EXPLAIN (ANALYZE, BUFFERS) | Database-wide slowdown → check pg_stat_activity for locks |
| Finding worst queries → pg_stat_statements | Connection limit issues → check max_connections |
| Missing index → pg_stat_user_tables seq_scan count | Hardware I/O bottleneck → check OS iostat |
| Stale statistics → ANALYZE | Table bloat → VACUUM (not ANALYZE) |
| Production plan capture → auto_explain | Real-time lock waits → pg_locks join pg_stat_activity |
- EXPLAIN ANALYZE has measurement overhead: instrumentation adds 5–15% overhead, so plans for very fast queries (<1ms) may appear disproportionately slow under EXPLAIN ANALYZE. [src1]
- pg_stat_statements normalizes parameters: it replaces literal values with $1, $2, etc. to group similar queries. You can't see actual parameter values — use pg_stat_activity or the slow query logs for that. [src2]
- Check autovacuum activity via pg_stat_activity WHERE backend_type = 'autovacuum worker'. [src1, src4]
- work_mem is per-sort-operation, not per-session: setting work_mem = '256MB' in a session with 10 sort operations can use 2.5GB of RAM. Increase it globally with caution; set it per-session for known heavy queries. [src1]
- pg_stat_statements and pg_stat_user_indexes/tables counters reset on restart. idx_scan = 0 is only meaningful since the last reset. [src2]
- In PostgreSQL 18+, EXPLAIN (ANALYZE) automatically includes BUFFERS output. Existing scripts that parse EXPLAIN output may see unexpected additional lines. [src8]