PostgreSQL Performance Engineering

Production-grade PostgreSQL optimization from single-node tuning to globally distributed architectures: the storage and execution model, server and memory configuration, query and planner tuning, indexing and schema design, concurrency and maintenance, replication and high availability, and scaling out to extreme throughput.

29 concepts, ordered as a guided path. Start at the top and follow the thread.

How PostgreSQL stores and executes: the MVCC row-version model, heap pages and TOAST, the query planner and its cost model, and the write-ahead log. The mental models every later optimization rests on.

  1. 1
    MVCC and Tuple Visibility foundation

    How PostgreSQL's multiversion concurrency control works and why it underpins every later optimization: row versions stamped with xmin and xmax, snapshot-based visibility, what INSERT, UPDATE, and DELETE do physically, the dead-tuple bloat they create, the xmin horizon and the long-transaction trap, and the system columns that let you see it all.

  2. 2
    Heap Pages and TOAST foundation

    How PostgreSQL physically stores rows: the 8 kB page, the heap-tuple header and alignment padding, fillfactor, TOAST for oversized values, and the FSM/VM maps, plus the diagnostics to measure it all.

  3. 3
    The Query Planner and Cost Model foundation

    How the PostgreSQL planner sits in the query lifecycle and picks the lowest-estimated-cost plan, covering cost units, the sequential-scan cost formula, startup vs total cost, selectivity estimation, the plan-node menu, join-order search, plan caching, reading EXPLAIN, and the failure modes of a wrong estimate.

  4. 4
    WAL, Checkpoints, and Durability intermediate

    How PostgreSQL turns scattered random page writes into one sequential, fsync'd log to guarantee durability, how checkpoints bound recovery and recycle that log, and the trade-offs (full-page writes, synchronous_commit levels, checkpoint sizing) experts must get right.

Tuning the server to the workload and the hardware: memory (shared_buffers, work_mem, effective_cache_size), the write-ahead log and checkpoints, the planner cost settings, and autovacuum.

  1. 1
    Tuning Memory for Throughput intermediate

    How PostgreSQL's memory parameters (shared_buffers, effective_cache_size, work_mem, maintenance_work_mem, huge_pages) actually behave, how to measure cache hits and temp-file spills, and how to size them against total RAM so a tuning change does not invite the out-of-memory killer.

  2. 2
    WAL and Checkpoint Tuning intermediate

    How to tune PostgreSQL's write-ahead log and checkpoints for write throughput: sizing max_wal_size and checkpoint_timeout, cutting WAL with wal_compression and wal_level, choosing synchronous_commit levels and group commit, tuning the background writer, and reading the diagnostics, with the four classic failure modes.

  3. 3
    Tuning the Planner Cost Settings intermediate

    How to change PostgreSQL's planner cost constants so its plan choices match real hardware: lowering random_page_cost on SSD, sizing effective_cache_size and effective_io_concurrency, tuning parallel query and JIT, and rolling changes out safely.

  4. 4
    Autovacuum Tuning advanced

    How to make the PostgreSQL autovacuum daemon keep up at scale: why the default trigger starves large tables, per-table scale factors, insert-driven autovacuum, cost-based I/O throttling, the shared worker budget, vacuum memory, daemon-level freezing controls, autoanalyze, and the diagnostics that show whether autovacuum is winning or losing.

Making individual queries fast: reading EXPLAIN ANALYZE, choosing scan methods and join algorithms, understanding statistics and row estimation, and rewriting queries to avoid the planner's traps.

  1. 1
    Reading EXPLAIN ANALYZE intermediate

    How to read an EXPLAIN ANALYZE plan: the estimate-versus-measurement distinction, the option set, tree traversal order, the estimated-versus-actual-rows diagnostic, the per-loop reporting trap, BUFFERS, the per-node lines that name waste and spills, and how to find and safely explain the right query.

  2. 2
    Scan Methods and When Each Wins intermediate

    How PostgreSQL's planner chooses among Sequential, Index, Index Only, and Bitmap scans, why physical-row correlation and the visibility map decide whether an index helps, and how to read the plan and tune the boundary.

  3. 3
    Join Algorithms and Join Order advanced

    How PostgreSQL's three join algorithms (Nested Loop, Hash Join, Merge Join) actually work, how the planner chooses among them and orders many-way joins, how join types (inner, outer, semi, anti) execute, and how a row misestimate drives a catastrophic Nested Loop that better statistics, not hints, must fix.

  4. 4
    Statistics and Row Estimation advanced

    How ANALYZE samples data into pg_stats, how the planner converts those statistics into selectivities and row estimates, and why the independence assumption on correlated columns causes most bad plans, with the extended-statistics and target fixes.

  5. 5
    Query Rewriting and Anti-Patterns advanced

    How to rewrite SQL the PostgreSQL planner handles badly into equivalent forms it optimizes well, covering sargability, implicit casts, OR-to-UNION, semi-joins and the NOT IN null trap, CTE materialization, keyset pagination, N+1 batching, cheap COUNT/LIMIT/DISTINCT shapes, generic-plan skew, set-based writes, and proving each rewrite with EXPLAIN ANALYZE and pg_stat_statements.

Designing the on-disk shape of the data: B-tree internals and the other index types, advanced indexing (partial, expression, covering), data-type and schema choices, and table partitioning.

  1. 1
    Data Types and Schema Design intermediate

    Picking the right PostgreSQL types and constraints up front, from integer widths and timestamptz to uuidv7 versus bigint identity, jsonb versus columns, alignment-driven column ordering, NULL semantics, identity columns, and the constraint vocabulary that turns the schema into a spec.

  2. 2
    B-Tree Index Internals intermediate

    The default PostgreSQL index: its on-disk shape, the leftmost-prefix rule for multicolumn keys, insert and update mechanics, deduplication and bottom-up deletion, INCLUDE columns, operator classes, and the diagnostics for finding bloated or unused indexes.

  3. 3
    Index Types Beyond B-Tree advanced

    Choosing the right PostgreSQL index type when the B-tree falls short: GIN, GiST, SP-GiST, BRIN, Hash, and pgvector, the operator classes behind them, and the failure modes engineers actually hit.

  4. 4
    Advanced Indexing Techniques advanced

    How a principal PostgreSQL engineer shapes btree indexes for production: partial, expression, covering, multicolumn, and ordered indexes, built and repaired with CREATE/REINDEX CONCURRENTLY, kept healthy against bloat with pgstatindex and fillfactor, matched to queries with operator classes, and verified with EXPLAIN (ANALYZE, BUFFERS).

  5. 5
    Table Partitioning advanced

    PostgreSQL declarative table partitioning: RANGE/LIST/HASH methods, the partition-key-in-primary-key rule, partition pruning, partition-wise join and aggregate, the CONCURRENTLY index workflow, DETACH/ATTACH for rotation and bulk load, pg_partman automation, and the planning-time cost that caps partition counts at hundreds.

Keeping a busy database correct and lean under concurrency: isolation levels and the anomalies MVCC still allows, locking and deadlocks, table bloat and HOT updates, and vacuum, freezing, and transaction-ID wraparound.

  1. 1
    Isolation Levels and MVCC Anomalies advanced

    How PostgreSQL implements the three available isolation levels on top of MVCC, which anomalies each prevents (including write skew, which only Serializable catches), the SSI predicate-lock mechanism and its tuning knobs, the syntax for setting levels and using DEFERRABLE, the application retry-loop pattern for SQLSTATE 40001, how Oracle, SQL Server, and MySQL/InnoDB differ, and how to diagnose the cost in production.

  2. 2
    Locking and Deadlocks advanced

    How PostgreSQL's lock manager works: the eight table lock modes and their conflict matrix, the four row-lock modes and why FOR NO KEY UPDATE is the right default, SKIP LOCKED and NOWAIT for queue patterns, advisory locks as application mutexes, lock_timeout and friends, deadlock cycle detection with a worked example, ALTER TABLE as the silent killer with its fast/slow classification and add-then-validate workaround, and the pg_locks/pg_blocking_pids diagnostics to find what is blocking what right now.

  3. 3
    Table Bloat and HOT Updates advanced

    How table bloat actually grows, how to measure it with pgstattuple and the canonical bloat-estimating query, why the Heap-Only Tuple (HOT) update path is the single biggest defense against it, how fillfactor and indexed-column updates control HOT eligibility, how to diagnose HOT with pg_stat_user_tables, the lock implications of VACUUM FULL, CLUSTER, pg_repack, and pg_squeeze, REINDEX CONCURRENTLY for index bloat, and why the xmin horizon is the real root cause of most cleanup failures.

  4. 4
    Vacuum, Freezing, and Wraparound advanced

    The mechanics of VACUUM and freezing in PostgreSQL: VACUUM's three jobs, its phase pipeline (heap scan, index vacuum, heap vacuum, cleanup, truncate), the difference between VACUUM, VACUUM (FULL), CLUSTER, and pg_repack, the 32-bit transaction id wraparound danger and the autovacuum_freeze_max_age / vacuum_failsafe_age / shutdown safety stops, how freezing actually marks a tuple with HEAP_XMIN_FROZEN, the cooperation of vacuum_freeze_min_age and vacuum_freeze_table_age, the parallel multixact wraparound, the diagnostic queries on pg_database and pg_class, and the operational playbook for a high-age incident.

Surviving failure and scaling reads: physical streaming and synchronous replication, logical replication and change data capture, and automated high availability and failover.

  1. 1
    Streaming and Synchronous Replication advanced

    How PostgreSQL physical replication actually works: walsender on the primary streams WAL over libpq to a walreceiver on the standby, the startup process replays it, and the same WAL underpins hot standby reads, replication slots, and synchronous commits. Covers the four LSN positions, replica conflicts and hot_standby_feedback, quorum syntax in synchronous_standby_names, real cross-AZ and cross-region latency numbers, and the failure modes (stuck slot, canceled report, stalled commits, cross-region durability) that the design forces operators to face.

  2. 2
    Logical Replication and CDC advanced

    How PostgreSQL turns its WAL into a row-level change stream, how publications, subscriptions, replica identity, and slots wire it together, and how that machinery powers near-zero-downtime major-version upgrades, selective analytics replication, and CDC pipelines into Kafka with tools like Debezium.

  3. 3
    High Availability and Failover expert

    How to build HA on top of PostgreSQL: writing down RTO and RPO targets, running Patroni against a 3- or 5-node DCS quorum with a watchdog for split-brain prevention, promoting via pg_ctl promote and rejoining via pg_rewind, routing through HAProxy that follows Patroni's REST endpoint, combining synchronous replication and quorum failover for RPO=0, and grounding all of it on pgBackRest or wal-g archives with rehearsed point-in-time recovery.

Growing beyond one node: connection pooling at scale, read scaling and caching, sharding with Citus, and globally distributed PostgreSQL at extreme throughput.

  1. 1
    Connection Pooling at Scale advanced

    Why a PostgreSQL connection is expensive (a forked process, not a thread), the pool-mode trade-offs in PgBouncer (session, transaction, statement) and what session-level features each one breaks, protocol-level prepared statements in transaction mode since PgBouncer 1.21, sizing knobs and the surrounding PostgreSQL timeouts (statement_timeout, idle_in_transaction_session_timeout), application-side patterns for transaction pooling (SET LOCAL, advisory_xact_lock, dedicated LISTEN sessions), layered deployment, managed alternatives (RDS Proxy, AlloyDB), and the diagnostics and failure modes you meet in production.

  2. 2
    Read Scaling and Caching advanced

    How to scale PostgreSQL reads horizontally: the ladder from shared buffers through replicas, materialized views, and external caches, with routing, read-your-writes defenses, invalidation strategies, and the failure modes that look like database problems but are really caching mistakes.

  3. 3
    Sharding with Citus advanced

    Citus turns a PostgreSQL cluster into a distributed database via an extension: coordinator plus workers, distributed/reference/local table types, colocation as the design lever, shard count and logical-replication rebalancing, single-shard vs multi-shard vs repartition query plans, primary-key and foreign-key rules, the multi-tenant SaaS sweet spot, and the failure modes (low-cardinality keys, cross-tenant transactions, non-colocated joins, coordinator bottleneck) experienced engineers still hit.

  4. 4
    Globally Distributed PostgreSQL expert

    The capstone of the scaling-and-distributed track: how PostgreSQL stretches across regions, where the physics and the regulators force the topology choice, and when the right answer is no longer stock PostgreSQL at all. Covers the four topologies (single-region HA, primary plus regional replicas, multi-writer, geo-sharded) with real latency numbers, the NewSQL contenders (CockroachDB, YugabyteDB, Spanner, Aurora), Aurora Global Database's actual RPO and RTO, a decision tree that starts at residency, hybrid patterns that ship in production, global backup design with per-region pgBackRest plus cross-region object replication, and the seven failure modes that bite globally distributed PostgreSQL hardest.