Join Algorithms and Join Order

How PostgreSQL's three join algorithms (Nested Loop, Hash Join, Merge Join) actually work, how the planner chooses among them and orders many-way joins, how join types (inner, outer, semi, anti) execute, and how a row misestimate drives a catastrophic Nested Loop that better statistics, not hints, must fix.

advanced 28 min read in Query Optimization

postgresqlquery-optimizationjoinshash-joinnested-loopmerge-joinquery-plannerexplain

Learning outcomes

A join is where most slow queries are won or lost. The planner has three ways to combine two tables and a factorial number of orders to combine many, and one wrong row estimate can send it down a path that runs a thousand times slower than the right one. This page makes those choices legible, so that when you read a plan you know not just what PostgreSQL did but why, and what to change when it chose wrong. It builds directly on the Scan Methods and When Each Wins page (how each table is read) and the Reading EXPLAIN ANALYZE page (how to read the plan tree).

After studying this page, you can:

Describe how a Nested Loop, a Hash Join, and a Merge Join each combine two inputs, and name the input shape each one wins on.
Predict when a Hash Join will spill to disk and read the Batches and Disk numbers that prove it.
Explain why join order often matters more than join algorithm, and how join_collapse_limit and GEQO bound the planner’s search.
Tell a semi join from an anti join, and explain why EXISTS often beats IN and why NOT IN over a nullable column is a correctness trap.
Spot a row misestimate that drove the planner into a catastrophic Nested Loop, and fix it with statistics rather than hints.

Before we dive in

You should be comfortable reading an EXPLAIN ANALYZE plan tree: nodes nest, the indented child feeds its parent, and each node reports estimated rows next to actual rows. You should also know the basic scan nodes (Seq Scan, Index Scan, Index Only Scan, Bitmap Heap Scan) and when each is cheap, because a join’s cost is mostly the cost of how it reads its two inputs. Both topics have their own pages; this one assumes them.

A few terms, defined as we use them. A join combines rows from two relations into one result by matching a condition, usually equality on a key. The two inputs are the outer side (the one driven across in a loop, drawn as the upper input in the plan) and the inner side (the one searched for each outer row). An equi-join is a join whose condition is equality, like a.id = b.a_id; some algorithms only work on equi-joins. An intermediate result is the set of rows produced after joining some tables but before joining the rest. Keep those four in mind. Everything below is built from them.

Mental Model

The tempting wrong model is that a join is one operation with one cost, the way SELECT from a single table is. Under that model you tune a slow join by adding an index and hoping, and you treat “the join” as a black box that either is fast or is not.

The better model is that a join is a tiny program the planner writes, and it has both an algorithm and an order to choose. The algorithm is how it matches two inputs: walk one and look up the other (Nested Loop), hash one and probe it (Hash Join), or sort both and zip them (Merge Join). The order is which pair of tables it joins first when there are several, so the in-between results stay small. The planner enumerates many such programs, prices each with its cost model, and runs the cheapest. When a join is slow, you are not looking at a black box. You are looking at one program the planner chose, and your job is to see which assumption made it choose wrong. Almost always that assumption is a row-count estimate.

Breaking it down

1. Why a join needs an algorithm at all

Start with the problem, because the three algorithms only make sense against it. You have two tables, orders and customers, and you want every order paired with its customer: orders.customer_id = customers.id. The naive way is to take each order and look through every customer for a match. With ten thousand orders and ten thousand customers that is a hundred million comparisons for a result of ten thousand rows. The work is the product of the inputs, and it explodes.

So a join is not free even when the result is small. The size of the answer tells you nothing about the cost of computing it. That gap is exactly why PostgreSQL has three different join algorithms: each one trades work in a different way, and each wins on a different shape of input. One avoids the product by looking up the inner side through an index. One avoids it by hashing. One avoids it by sorting both sides first. The whole rest of this page is those three trades and when each pays off.

flowchart TB
    Q["JOIN orders to customers ON customer_id = id"] --> NL["Nested Loop: look up inner per outer row"]
    Q --> HJ["Hash Join: hash one side, probe with the other"]
    Q --> MJ["Merge Join: sort both, then zip"]
    NL --> P["Planner prices all three, runs the cheapest"]
    HJ --> P
    MJ --> P

2. Nested Loop: scan the inner side once per outer row

The Nested Loop is the join you would invent yourself. For each row on the outer side, it scans the inner side for matches. That is two nested loops, hence the name, and its cost is roughly the outer row count times the cost of one inner search.

The whole story of the Nested Loop is what that inner search costs. If the inner side has no useful index, each search is a full scan of the inner table, and the total is outer_rows * inner_rows. Ten thousand by ten thousand is a hundred million row touches: catastrophic, and the classic signature of a join gone wrong. But if the inner side has an index on the join key, each search is a cheap index lookup, and the total is outer_rows times the cost of one index probe. When the outer side is small, say a few hundred rows after a selective filter, a Nested Loop with an indexed inner side is often the fastest plan there is. It touches almost nothing.

So the Nested Loop wins in exactly one situation: the outer side is small and the inner side is indexed on the join key. It is the planner’s choice for “fetch a few rows, then look up their matches one at a time.” It is the planner’s disaster when a wrong estimate makes a large outer side look small, which is rung 9.

Nested Loop  (cost=0.43..900.50 rows=200)
  ->  Index Scan on orders        (rows=200)        -- small outer
  ->  Index Scan on customers     (rows=1)          -- one indexed probe per outer row
        Index Cond: (id = orders.customer_id)

PostgreSQL 14 added Memoize, a cache that sits on the inner side of a Nested Loop. When the same outer key repeats, Memoize returns the cached inner rows instead of re-probing the index. It helps when outer keys repeat heavily, for example joining a large line_items table to a small products table where the same product id appears thousands of times. You will see a Memoize node above the inner scan, with a hit-ratio line. It does not change the algorithm; it just removes redundant inner lookups.

-- A Nested Loop is the right plan here: tiny outer, indexed inner.
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at >= now() - interval '1 hour';   -- only a few hundred recent orders

3. Hash Join: build a table, then probe it

When both inputs are large and unsorted, looking up the inner side one row at a time is too slow, and the Hash Join takes over. It runs in two phases. First the build phase reads the smaller input in full and loads it into an in-memory hash table keyed on the join column. Then the probe phase reads the larger input and, for each row, hashes its key and looks in the table for matches. Each side is read exactly once. The cost is linear in the sum of the inputs, not their product, which is why the Hash Join is the workhorse for large equi-joins.

Two constraints define it. First, it is equality only. A hash table answers “is this key present,” not “is this key greater than,” so a Hash Join cannot serve a range or inequality join condition. Second, the build side must fit in memory, and that is where it gets interesting. The hash table’s budget is work_mem multiplied by hash_mem_multiplier (default 2.0), so a hash node may use up to twice work_mem before it spills. When the build side is too big for that budget, PostgreSQL partitions both inputs into batches by a hash of the key, keeps one batch in memory, and writes the rest to temporary files, processing them batch by batch. The Tuning Memory for Throughput page covers how to size that budget.

You read the spill straight off the plan. A healthy in-memory Hash Join shows Batches: 1. A spilled one shows Batches greater than 1 and a Disk: figure for the temp-file traffic. More batches means more passes over disk, which is the signal to raise work_mem for that query or shrink the build side with a better filter.

Hash Join  (cost=270.00..540.00 rows=10000)
  Hash Cond: (o.customer_id = c.id)
  ->  Seq Scan on orders o     (rows=1000000)               -- probe side (larger)
  ->  Hash  (rows=10000)                                    -- build side (smaller)
        Buckets: 16384  Batches: 8  Memory Usage: 4096kB    -- Batches > 1: it spilled
        ->  Seq Scan on customers c  (rows=10000)

A Hash Join, build then probe

Pick the build sideThe planner builds on the smaller input, here customers. Smaller build means a smaller hash table and a better chance of staying in memory.

Step 1 of 5

4. Merge Join: zip two sorted streams

The Merge Join joins two inputs that are both sorted on the join key. It walks the two streams together with two cursors, advancing whichever is behind, and emitting a row whenever the keys match. Picture merging two already-sorted stacks of paper by name: you never look back, you just zip down both. Each side is read once, in order, and the matching is a single linear pass.

The catch is the precondition: both inputs must arrive sorted on the join key. That order can come for free or it can cost a sort. It is free when an index on the join column already returns rows in key order, so the planner reads the index and feeds the Merge Join directly. It costs a sort when the input is unsorted, in which case the planner inserts a Sort node below the Merge Join to produce the order first. So a Sort feeding a Merge Join is not an accident; it is the planner deciding that sorting both sides and merging is cheaper than the alternatives.

When does that pay? Two cases. First, when the inputs are already sorted, because then the merge is nearly free and beats building a hash table. Second, on very large joins where even after paying for two sorts, the merge is cheaper than a hash that would spill across many batches. Like the Hash Join, the textbook Merge Join handles equality, but unlike the Hash Join it can also serve range conditions, since sorted order lets it reason about greater-than and less-than as it walks.

Merge Join  (cost=0.85..1100.00 rows=10000)
  Merge Cond: (o.customer_id = c.id)
  ->  Index Scan on orders_customer_id_idx     (rows=10000)   -- already sorted by the index
  ->  Sort                                                    -- planner sorts the other side
        Sort Key: c.id
        ->  Seq Scan on customers c             (rows=10000)

5. How the planner picks one of the three

The planner does not have a favorite. For every pair of relations it could join, it prices all three algorithms with its cost model and keeps the cheapest, then carries that choice up the tree. Two inputs drive the decision: the estimated sizes of the two inputs, and whether either input already arrives sorted.

The logic, stated plainly: a small outer side with an indexed inner side favors the Nested Loop, because a few indexed lookups beat building any table. Two large unsorted inputs joined on equality favor the Hash Join, because one linear build-and-probe beats everything else. Inputs that are already sorted on the join key, or so large that a hash would spill badly, favor the Merge Join, because the merge is then nearly free or the sort is worth it. The table below is the cheat sheet.

Algorithm	Best case	Memory use	Needs sorted input	Equality only
Nested Loop	small outer, indexed inner	almost none	no	no
Hash Join	large unsorted equi-join	build side in `work_mem * hash_mem_multiplier`	no	yes
Merge Join	inputs already sorted, or very large	sort buffers if a Sort is needed	yes	no (handles ranges)

Which join, and why

For each outer row, search the inner side. Wins when the outer side is small and the inner side has an index on the join key, so each search is one cheap probe. Without an index it degrades to outer times inner and is catastrophic on large inputs. Memoize (PG14) caches inner lookups when outer keys repeat. The only algorithm that handles non-equality conditions naturally.

6. Join order: the choice that beats the algorithm

Here is the rung that separates someone who can name the algorithms from someone who can fix a slow query. When a query joins more than two tables, the planner must also choose the order to join them, and that choice usually matters more than which algorithm runs at each step.

Why does order matter so much? Because the cost of a join depends on the size of its inputs, and the order decides how big the intermediate results are. Join the two tables whose match produces the fewest rows first, and every later join works on a small input. Join in a careless order and an early step can balloon to millions of rows that every later step must then grind through. The intuition experienced engineers carry is drive from the most selective filter first: start with the table whose WHERE clause cuts it down the hardest, so the very first intermediate result is tiny, and keep it tiny.

The trouble is the search space. The number of possible join orders grows factorially in the number of tables, so an exhaustive search becomes impossible past a dozen or so. PostgreSQL bounds the work with three knobs. join_collapse_limit (default 8) caps how many FROM-list and explicit-JOIN items the planner will flatten into one group to reorder freely; above that limit it stops merging and respects more of your written order. from_collapse_limit (default 8) does the same for subqueries pulled up into the parent query. And once the number of tables in one group reaches geqo_threshold (default 12), the Genetic Query Optimizer takes over: instead of examining every order, GEQO searches the space heuristically, trading a guarantee of the optimum for a plan found in reasonable time.

flowchart LR
    A["N tables to join"] --> B{"Group size below geqo_threshold (12)?"}
    B -->|"yes"| C["Exhaustive search of orders, bounded by join_collapse_limit (8)"]
    B -->|"no"| D["GEQO heuristic search, good plan not guaranteed optimal"]
    C --> E["Cheapest plan runs"]
    D --> E

7. Join types: inner, outer, semi, and anti

So far we have talked about how to match rows. The join type decides what to do with rows that have no match, and it shapes both correctness and the plan. There are more types than the inner join you write most often.

An inner join keeps only rows that matched on both sides. A left outer join keeps every outer row and fills the inner columns with nulls where there was no match; right and full outer joins extend that to the other side and to both. These are about which unmatched rows survive. Two more types are about existence rather than columns, and they are the ones that surprise people.

A semi join answers “does at least one match exist.” It is what WHERE EXISTS (...) and IN (...) compile to. The key behavior is that it stops at the first match and emits the outer row once, never multiplying it by the number of inner matches. An anti join is the mirror: it keeps outer rows that have no match, which is what WHERE NOT EXISTS (...) compiles to. The plan literally says Semi Join or Anti Join in the node, which tells you the planner understood your intent.

This is why EXISTS often beats IN. EXISTS maps cleanly to a semi join that stops at the first hit, while a poorly written IN over a subquery can materialize the whole inner set first. And it is why NOT IN over a nullable column is a correctness trap, not just a performance one. If the inner subquery returns even one NULL, SQL’s three-valued logic makes NOT IN return no rows at all, because “is x not in this set” becomes unknown the moment a NULL is in the set. NOT EXISTS does not have this flaw and compiles to a clean anti join. Prefer NOT EXISTS.

-- Semi join: customers who have placed at least one order. Stops at the first match.
SELECT c.id, c.name
FROM customers c
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);

-- Anti join: customers with no orders. Prefer NOT EXISTS over NOT IN.
SELECT c.id, c.name
FROM customers c
WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);

-- The trap: if orders.customer_id contains any NULL, this returns ZERO rows, silently.
SELECT c.id FROM customers c
WHERE c.id NOT IN (SELECT customer_id FROM orders);

Join-type gotchas worth internalizing

8. Parallel Hash Join and when parallelism pays

A large join can be split across worker processes. In a Parallel Hash Join, multiple workers cooperate to build one shared hash table in dynamic shared memory, then each worker probes it with a slice of the larger input in parallel. The plan shows a Gather node on top, a Parallel Hash Join, and a Parallel Hash for the shared build. Because all workers share one hash table rather than each building a private copy, the build cost is paid once, not once per worker.

Parallelism helps a join when there is enough work to amortize the cost of starting and coordinating workers: large inputs, a Seq Scan or large index scan feeding the join, and CPU as the bottleneck rather than a tiny result. It does not help, and can hurt, on small joins where worker startup dwarfs the saved time. The planner decides based on table size against min_parallel_table_scan_size and the per-query worker ceiling from max_parallel_workers_per_gather, all under the cluster-wide max_parallel_workers. When a big analytical join is CPU-bound and runs single-threaded, raising max_parallel_workers_per_gather and confirming the table is large enough to qualify is the lever to pull.

Gather  (cost=... rows=...)
  Workers Planned: 4
  ->  Parallel Hash Join
        Hash Cond: (o.customer_id = c.id)
        ->  Parallel Seq Scan on orders o
        ->  Parallel Hash
              ->  Parallel Seq Scan on customers c

9. The misestimate that picks the wrong join

This is the dominant real-world join failure, and it ties the whole page together. The planner picks an algorithm from estimated row counts. When an estimate is badly wrong, the algorithm it picks is wrong, and the Nested Loop is where it hurts most.

Here is the chain. The planner estimates that a filter returns, say, 10 rows, so it chooses a Nested Loop, because looking up an indexed inner side 10 times is cheap. But the filter actually returns 200,000 rows. Now that same Nested Loop probes the inner side 200,000 times, and a plan the planner priced at a few milliseconds runs for minutes. Nothing is broken. The algorithm is doing exactly what it was told; it was told the wrong size.

EXPLAIN ANALYZE makes this visible because it prints estimated rows next to actual rows for every node. The signature is a node whose estimate and actual diverge by orders of magnitude, sitting under a Nested Loop with a huge loops= count on its inner side. When you see rows=10 estimated and rows=200000 actual, with loops=200000 on the inner scan, you have found it. The fix is almost never a hint. PostgreSQL has no query hints by design; the fix is to give the planner a better estimate. Run ANALYZE to refresh stale statistics. Raise the statistics target on a skewed column. And when two columns are correlated, so the planner’s independence assumption multiplies their selectivities into a far-too-small number, create extended statistics so it learns the real correlation.

-- The planner assumes city and country are independent and underestimates the match.
-- Extended statistics teach it the real correlation, fixing the row estimate.
CREATE STATISTICS addr_geo (dependencies, ndistinct)
  ON city, country FROM addresses;
ANALYZE addresses;

A join driven by a wrong estimate

EXPLAIN ANALYZE shows the outer scan estimated at rows=10 but actual rows=200000, feeding a Nested Loop whose inner Index Scan runs with loops=200000. The query the planner priced in milliseconds runs for minutes. The algorithm is fine; the size it was given is a lie.

10. Diagnosing and overriding a bad join plan

Put the diagnostics together into a routine you can run on any slow join. Start with EXPLAIN (ANALYZE, BUFFERS) and read the join nodes top to bottom against four questions.

First, is there a Nested Loop with a huge inner loops count and a divergent row estimate? That is the misestimate from rung 9. Refresh statistics, raise the per-column statistics target, or add extended statistics on correlated columns. Second, does a Hash Join show Batches greater than 1 with a Disk figure? It spilled. Raise work_mem for that query with SET LOCAL work_mem = '256MB'; inside the transaction, or shrink the build side with a tighter filter. Third, is there a Sort feeding a Merge Join that dominates the runtime? Consider an index that supplies the order for free, or check whether a Hash Join would be cheaper now that you have looked. Fourth, did an OR across two tables force a Nested Loop? A condition like a.x = b.x OR a.y = b.y cannot be served by a single hash or merge, because those need one equality key, so the planner falls back to a Nested Loop; splitting the query into a UNION of two clean equi-joins often restores a Hash Join.

As a genuine last resort, you can take the join order into your own hands. Setting join_collapse_limit = 1 tells the planner not to reorder joins at all, so it executes them exactly in the order you parenthesized them in the FROM clause. This is a manual override for the rare case where you know the right order and the planner, misled by estimates it cannot fix, keeps choosing wrong. Reach for it only after you have tried to fix the estimates, because a frozen order that is right today can rot as data shifts.

-- Last-resort manual join order. The planner will not reorder; it joins as written.
SET LOCAL join_collapse_limit = 1;
SELECT *
FROM small_filtered s
  JOIN medium m ON m.s_id = s.id      -- joined first, on purpose
  JOIN large l  ON l.m_id = m.id;     -- joined last, against the smallest possible input

Check yourself

An EXPLAIN ANALYZE shows a Nested Loop whose inner Index Scan reports loops=300000, and the node below the loop is estimated at rows=12 but actual rows=300000. What is the right first move?

Mastery Questions

A reporting query joins a 50-million-row events table to a 2-million-row sessions table on events.session_id = sessions.id, with no filter that cuts either side down much. The plan uses a Nested Loop with an Index Scan on the inner side, and it runs for eight minutes. What is almost certainly wrong, and what plan do you want instead?

Answer. A Nested Loop is the wrong algorithm for two large inputs with no selective filter. It is searching the inner side once per outer row, which here means tens of millions of index probes, and that is why it runs for eight minutes even though each individual probe is fast. You want a Hash Join: build a hash table on the smaller 2-million-row sessions side, then probe it once per event in a single linear pass over the 50 million events, so each side is read exactly once. The reason the planner chose the Nested Loop is almost always a row misestimate that made the outer side look small, so confirm it with EXPLAIN ANALYZE by comparing estimated rows to actual rows and checking the loops count on the inner scan. The fix is better statistics (run ANALYZE, raise the statistics target, add extended statistics if columns are correlated), not a hint. If the hash then spills, shown by Batches greater than 1 and a Disk figure, raise work_mem for the query. The lesson is that the algorithm was a symptom; the wrong size estimate was the disease.
You need every customer who has never placed an order. A teammate writes WHERE customer_id NOT IN (SELECT customer_id FROM orders) and it returns zero rows, even though you can see customers with no orders. Explain the bug and the fix, and say what plan each version produces.

Answer. The bug is NOT IN over a nullable column. If orders.customer_id contains even one NULL, SQL’s three-valued logic makes the NOT IN predicate evaluate to UNKNOWN for every row, because “is this id not among a set that includes an unknown value” can never be proven true. UNKNOWN is not TRUE, so every row is filtered out and you get zero rows, silently and with no error. The fix is WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id). NOT EXISTS is null-safe because it asks only whether a matching row exists, never comparing against the nulls themselves, and the planner compiles it to a clean anti join that keeps exactly the outer rows with no match. The NOT IN version, when it does run, often produces a worse plan because the planner cannot turn a null-sensitive NOT IN into an anti join. So NOT EXISTS is both correct and faster. Make it your default for exclusion.
A query joining twelve tables suddenly produces a much worse plan than the same query joining eight tables, even though the extra four tables are tiny lookup tables. What mechanism is the likely cause, and what would you check and adjust?

Answer. Twelve tables is the default geqo_threshold, so the likely cause is that the Genetic Query Optimizer kicked in. Below that threshold the planner searches join orders exhaustively (bounded by join_collapse_limit, default 8) and is guaranteed to find the cheapest order it can price; at or above it, GEQO switches to a heuristic search that finds a good order quickly but is not guaranteed to be optimal, which is why a plan can degrade right at the boundary. First confirm by checking whether the table count in the collapsed join group reached geqo_threshold. If the lookup tables are trivial, you have a few levers. You can raise geqo_threshold (and join_collapse_limit) so the planner does an exhaustive search on this query, accepting longer planning time for a better plan, which is usually worth it for a query that runs often. You can confirm with geqo = off whether the exhaustive search really finds a better plan before committing to the config change. And as a last resort, if you know the right order, set join_collapse_limit = 1 and write the joins in that order explicitly. The general point: many-way joins are where the planner trades optimality for tractability, and the thresholds that govern that trade are tunable.

Recommended next

Statistics and Row Estimation
Builds directly on this page: Statistics and Row Estimation is the next step in the PostgreSQL performance ladder.

Sources & evidence14 claims · 4 cited

Algorithm behaviour, the cost-based choice, join-order knobs, GEQO, parallel hash join, and the misestimate failure mode are grounded in the PostgreSQL 16/17 planner-statistics, EXPLAIN, parallel-query, and runtime-resource docs (src_pg_docs_planner_stats, src_pg_docs_explain, src_pg_docs_parallel_query, src_pg_docs_runtime_resource). Algorithm mechanics and SQL three-valued-logic facts those parameter-oriented docs do not state directly (e.g. how a nested loop or merge join walks its inputs, why NOT IN over a nullable column returns no rows) are marked stable-common-knowledge with empty source_ids.

A Nested Loop join scans the inner side once for each outer row, so without a useful index on the inner join key its cost is the product of the two input sizes (outer rows times inner rows), which is catastrophic on large inputs.stable common knowledge
PostgreSQL 14 added a Memoize node that caches inner-side lookups in a Nested Loop, so when the same outer join key repeats the cached inner rows are returned instead of re-probing the index.verified
A Hash Join builds an in-memory hash table on the smaller input and probes it with the larger, reading each side exactly once, so its cost is linear in the sum of the inputs rather than their product; it serves equality conditions only.stable common knowledge
A hash node's memory budget is work_mem multiplied by hash_mem_multiplier (default 2.0); when the build side exceeds it, PostgreSQL partitions both inputs into batches and spills the extra batches to temporary files, and EXPLAIN then reports Batches greater than 1 along with a Disk figure.verified
A Merge Join requires both inputs sorted on the join key and then merges them in a single linear pass; that order is free when an index already returns rows in key order, otherwise the planner inserts a Sort node below the Merge Join to produce it.stable common knowledge
The planner prices all three join algorithms with its cost model for each candidate pair of relations and keeps the cheapest, driven chiefly by the estimated sizes of the two inputs and whether either input already arrives sorted on the join key.verified
The planner reorders joins to keep intermediate results small, the number of possible orders grows factorially in the number of tables, and join_collapse_limit (default 8) caps how many FROM-list and explicit-JOIN items it flattens into one group to reorder freely while from_collapse_limit (default 8) does the same for subqueries pulled up into the parent query.verified
When the number of tables in a join group reaches geqo_threshold (default 12), the Genetic Query Optimizer takes over and searches the join-order space heuristically rather than exhaustively, trading a guarantee of the optimal order for a plan found in reasonable time.verified
A semi join (from EXISTS or IN) stops at the first inner match and emits the outer row once without multiplying by the number of matches, while an anti join (from NOT EXISTS) keeps only outer rows that have no inner match.stable common knowledge
NOT IN over a nullable column is a correctness trap: if the inner subquery returns any NULL, SQL three-valued logic makes the NOT IN predicate evaluate to UNKNOWN for every row and the query silently returns no rows, whereas NOT EXISTS is null-safe and compiles to an anti join.stable common knowledge
In a Parallel Hash Join, cooperating worker processes build one shared hash table in dynamic shared memory and then each probes it with a slice of the larger input in parallel, so the build cost is paid once rather than once per worker.verified
Whether a join runs in parallel depends on the input table size against min_parallel_table_scan_size and the per-query worker ceiling max_parallel_workers_per_gather, all bounded by the cluster-wide max_parallel_workers.verified
A row misestimate that makes a large input look small drives the planner to choose a Nested Loop that then probes the inner side hundreds of thousands of times; EXPLAIN ANALYZE exposes it as an estimated-versus-actual row divergence with a huge loops count, and the fix is better statistics (ANALYZE, a higher statistics target, or extended statistics on correlated columns), not query hints, which PostgreSQL does not provide.verified
Setting join_collapse_limit to 1 disables join reordering so the planner executes joins exactly in the order they are written and parenthesized in the FROM clause, a last-resort manual override for when estimate fixes have not corrected a bad order.verified

Cited sources

PostgreSQL Documentation: Using EXPLAIN · PostgreSQL Global Development Group
PostgreSQL Documentation: Resource Consumption (Memory) · PostgreSQL Global Development Group
PostgreSQL Documentation: How the Planner Uses Statistics · PostgreSQL Global Development Group
PostgreSQL Documentation: Parallel Query · PostgreSQL Global Development Group