Reading EXPLAIN ANALYZE
How to read an EXPLAIN ANALYZE plan: the estimate-versus-measurement distinction, the option set, tree traversal order, the estimated-versus-actual-rows diagnostic, the per-loop reporting trap, BUFFERS, the per-node lines that name waste and spills, and how to find and safely explain the right query.
Learning outcomes
This page opens the query-optimization track, and it teaches the one skill the whole track rests on: reading a plan. Every tuning decision later, whether to add an index, raise work_mem, or rewrite a join, starts with a plan you can read honestly. The query-planner-and-cost-model page showed you how the planner builds costs from row estimates. Now you learn to hold the planner accountable by comparing what it predicted against what actually happened.
After studying this page, you can:
- Choose between
EXPLAINandEXPLAIN ANALYZEknowing exactly what each costs and reveals. - Read a plan tree in the right order and decode
cost,rows,width,actual time, andloopson every node. - Spot the estimate-versus-actual gap that signals a plan built on bad statistics, and trace it to its cause.
- Avoid the loops trap, where per-loop numbers hide a node that runs a hundred thousand times.
- Use
BUFFERSto see real I/O, and read the per-node lines that name wasted work and on-disk spills. - Find which slow query to explain in the first place, and explain a write statement without corrupting data.
Before we dive in
You should be comfortable writing SQL and you should have read the query-planner-and-cost-model page, because this page assumes you know that a plan is a tree of nodes and that each node carries an estimated cost in abstract units. If “the planner picks the lowest estimated total cost” is not yet obvious, read that page first.
A few terms, defined as we use them. A plan node is one operation in the tree: a scan, a join, a sort, an aggregate. A scan reads rows from a table or index. Cardinality is a fancy word for a row count, estimated or actual. A buffer is one 8 KB page held in PostgreSQL’s shared memory cache; reading a buffer that is already cached is cheap, reading one from disk is not. work_mem is the per-operation memory budget for sorts and hashes; when an operation needs more than its budget, it spills to temporary files on disk. Hold those five. Everything below is built from them.
Mental Model
The tempting wrong model is that EXPLAIN ANALYZE is just EXPLAIN with a timer bolted on, a slightly more detailed estimate. Under that model you read the same numbers and trust them the same way.
That is backwards in the way that matters most. EXPLAIN shows you only the planner’s beliefs: every number is a prediction made before a single row was touched. EXPLAIN ANALYZE actually runs the query and writes the real outcome next to each prediction. So the right model is a courtroom, not a fancier crystal ball. The cost and rows columns are the planner’s testimony about what it expected. The actual time and actual rows columns are the evidence of what really happened. Your whole job reading a plan is to put the testimony beside the evidence and find where they disagree. The biggest disagreement, almost always, is where your query is slow.
Keep this picture. You are not admiring an estimate. You are cross-examining one against reality, and the gap between them is the lead you follow.
Breaking it down
1. EXPLAIN guesses, EXPLAIN ANALYZE measures
Start with the difference that everything else depends on. EXPLAIN alone asks the planner to choose a plan and print it, then stops. The query never runs. So EXPLAIN is instant and safe on any statement, but every number it shows is an estimate: cost, row counts, widths, all of them predictions from the planner’s model.
EXPLAIN ANALYZE does something fundamentally different. It plans the query, then executes it, timing and counting along the way, and prints the real measurements beside the estimates. That extra word, ANALYZE, turns a prediction into a measurement. It also means the query truly runs: a slow query takes its full time, and a write statement really writes. We come back to that hazard in rung 8.
-- Estimates only. The query does not run. Safe anywhere, instant.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- Plans AND runs the query, then prints measured rows and time per node.
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
The single most common beginner mistake is reading EXPLAIN output, seeing a scary cost=0.00..18500.00, and concluding the query is slow. That cost is an abstract number anchored to seq_page_cost = 1.0, as the query-planner-and-cost-model page covers. It is not milliseconds. It is not even guaranteed to be right. Only ANALYZE tells you what actually took time, and that is the number a slow query is judged on.
2. The option set, and the one incantation to memorize
EXPLAIN takes a parenthesized option list, and a handful of those options are worth knowing cold. Here is each one and the job it does.
| Option | What it adds | Default |
|---|---|---|
ANALYZE | Runs the query, adds measured actual time, actual rows, loops | off |
BUFFERS | Per-node shared and temp buffer counts (the real I/O) | off without ANALYZE; on with ANALYZE in PG 16+ |
VERBOSE | Output column lists, schema-qualified names, worker detail | off |
COSTS | The estimated cost and rows columns | on |
TIMING | Per-node timing (costs almost nothing to turn off) | on when ANALYZE is on |
SETTINGS | Any planner-related setting changed from its default | off |
WAL | Write-ahead-log records and bytes generated (writes only) | off |
FORMAT | Output as text, json, yaml, or xml | text |
Two notes that catch people. First, in PostgreSQL 16 and later, BUFFERS turns on automatically whenever ANALYZE is on, because the project decided buffer counts are too useful to leave off; on older versions you must ask for it. Second, COSTS and TIMING are on by default, so you normally do not type them; you type TIMING OFF only to dodge slow-clock overhead, which rung 8 explains.
The incantation to keep in muscle memory is this:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, SETTINGS)
SELECT ...;
Each word earns its place. ANALYZE gives you reality instead of a guess. BUFFERS shows the actual I/O, so you can tell a cache hit from a disk read and see an on-disk spill. VERBOSE names the output columns and the worker breakdown, so a wide node or a parallel imbalance is legible. SETTINGS prints any planner knob that is not at its default, which is how you catch the colleague who set random_page_cost = 1.1 in this session and changed every plan. That single line answers most “why is this slow” questions before you ask a second one.
3. Reading the plan tree: inside-out, bottom-up
A plan is a tree, and the text format draws it with indentation. The trick that unlocks everything is reading it in the right order: a plan executes inside-out and bottom-up. The most-indented node runs first and feeds its parent; the top line is the last thing to finish and produces the final rows.
So you read a plan like nested function calls. The innermost, most-indented node is the deepest call. Its output flows up to the node it is indented under, which processes it and flows up again, until the topmost node returns the result. Children that share a parent are siblings: a join has two children, the outer (upper) and the inner (lower).
flowchart TB
A["Hash Join (top, runs last)"] --> B["Seq Scan on orders (outer child)"]
A --> C["Hash (inner child)"]
C --> D["Seq Scan on customers (runs first, deepest)"]Read that bottom to top: scan customers, build a hash from it, scan orders, probe the hash for each order, emit joined rows at the top. Now decode the numbers on a single node. Every node prints a cost line, and with ANALYZE an actual line right after it:
Seq Scan on orders (cost=0.00..18500.00 rows=1000 width=64)
(actual time=0.025..82.140 rows=998 loops=1)
Six numbers, six meanings. On the estimate line, cost=0.00..18500.00 is startup cost (work before the first row, here zero because a sequential scan streams immediately) then total cost (work to the last row). rows=1000 is the estimated row count. width=64 is the estimated average row width in bytes. On the actual line, actual time=0.025..82.140 is real milliseconds, startup then total, for this node. rows=998 is the measured row count. loops=1 is how many times this node was executed; hold that word, rung 5 is entirely about it. Reading a plan is reading these six numbers on every node and asking, at each one, does the testimony match the evidence.
4. The master diagnostic: estimated rows versus actual rows
Here is the single most valuable habit in plan reading, the one that separates guessing from diagnosing. On every node with ANALYZE, compare the estimated rows to the actual rows. When they are close, the planner understood your data and probably chose a reasonable plan. When they diverge by orders of magnitude, the plan was built on a lie, and that node is where to start.
Why does it matter so much? Because the planner chooses every join method, every scan type, and every sort strategy based on those row estimates. The query-planner-and-cost-model page showed how a row count drives cost. If the planner thinks a scan returns 10 rows, it will happily put it on the inner side of a nested loop and probe it again and again, which is cheap for 10 rows and catastrophic for 100,000. The estimate is the foundation; when it is wrong, every decision built on it is wrong too.
Look at a real divergence:
Index Scan using orders_status_idx on orders
(cost=0.43..8.45 rows=10 width=64)
(actual time=0.031..42.910 rows=98214 loops=1)
Index Cond: (status = 'pending')
The planner estimated rows=10. Reality was rows=98214. That is a four-orders-of-magnitude miss, and it is not a rounding error; it is a broken assumption. The cause is almost always the statistics behind the estimate. Either ANALYZE has not run since the data changed, so the planner is reasoning about a table that no longer exists, or the estimate is correlation-blind: it multiplied two selectivities assuming the columns are independent when they are not. The statistics-and-row-estimation page covers exactly how those numbers are formed and how to fix them with fresher or extended statistics. For now, internalize the reflex: a big estimate-versus-actual gap means the statistics are the problem, and no amount of staring at the join method will help until you fix the foundation.
flowchart LR
A["estimated rows = 10"] --> B["planner picks Nested Loop"]
C["actual rows = 98214"] --> D["inner side probed 98214 times"]
B --> E["plan tuned for a handful of rows"]
D --> F["catastrophic runtime"]
E --> F5. The loops trap: per-loop numbers that lie about totals
This is the rung experts still trip over, so read it slowly. When a node sits on the inner side of a nested loop, its parent calls it once per outer row. The loops count records how many times. And here is the trap: on that node, actual time and actual rows are reported per loop, averaged, not totaled. To get the true totals you multiply by loops yourself.
Watch how this hides an expensive node:
Nested Loop (cost=0.43..98421.00 rows=10 width=128)
(actual time=0.052..9210.441 rows=98214 loops=1)
-> Seq Scan on customers
(actual time=0.018..12.300 rows=98214 loops=1)
-> Index Scan using orders_cust_idx on orders
(cost=0.42..9.84 rows=1 width=64)
(actual time=0.071..0.088 rows=1 loops=98214)
Index Cond: (customer_id = customers.id)
Glance at the inner index scan and it looks innocent: actual time=0.071..0.088, under a tenth of a millisecond, rows=1. You might skip right past it. But loops=98214. That node ran 98,214 times. Its true contribution is roughly 0.088 ms * 98214 = 8642 ms, nearly nine seconds, which is exactly where the Nested Loop’s total of 9210 ms came from. The cheap-looking node is the whole problem, and the only sign is the loops count.
The misreading goes the other way too. Someone sees rows=1 on the inner node and concludes “this index scan returns one row, it is fine,” then reports the inner scan as returning one row total. It returned one row per loop, 98,214 rows in all. Always do the multiplication in your head: per-loop number times loops equals the truth. Tools like the depesz plan analyzer and the Dalibo plan visualizer do this multiplication for you and color the heaviest node, which is one good reason to paste a plan into them.
6. BUFFERS: what each node actually touched
BUFFERS answers a question costs cannot: how much data did this node really move, and where did it come from? Costs are estimates; buffer counts are measured page touches. Turn BUFFERS on always, and on PostgreSQL 16 and later you get it free with ANALYZE.
A buffer is one 8 KB page. The Buffers line breaks page touches into categories that each tell a different story:
shared hitis a page found already in the buffer cache. Cheap, no disk involved.shared readis a page fetched from disk (or the OS cache) into a buffer. This is your real read I/O.shared dirtiedis a page this statement modified, marking it for a later write.shared writtenis a dirty page this statement had to flush itself, often because the cache was under pressure.temp readandtemp writtenare temporary-file pages: an on-disk spill because an operation outgrewwork_mem. Temp blocks are a red flag for sorts and hashes.
Read a buffer line as a quantity. Each page is 8 KB, so the count times 8 KB is data volume:
Seq Scan on events
(actual time=0.030..1840.220 rows=2100000 loops=1)
Buffers: shared hit=4096 read=258048
That node touched 4096 + 258048 = 262144 buffers, which is 262144 * 8 KB = 2 GB, and only 4096 of those pages were cached. You just learned that this scan dragged 2 GB off disk, which is almost certainly why it took 1.8 seconds, and that the table does not fit in cache. No cost number tells you that. When you see temp read and temp written on a Sort or a Hash node, you have found a spill: the operation needed more memory than work_mem allowed and went to disk, which is one of the most common and most fixable causes of a slow query.
7. The per-node lines that name the waste
Beyond the cost and actual lines, nodes print extra lines that name specific work, and several of them point straight at a fix. Learn to scan for these.
The pattern across all of these: the plan is telling you, in plain text, where it is doing work you did not want. Rows Removed by Filter is a missing index. external merge Disk and Batches: 8 are work_mem spills. A high Heap Fetches is a stale visibility map. You do not have to guess; you read the line that names the waste and fix that.
8. Operational cautions: writes, slow clocks, and finding the query
Three practical hazards stand between you and a clean plan reading. None is hard, but each has bitten experienced engineers.
First, the dangerous one: EXPLAIN ANALYZE executes the statement. On a SELECT that is merely slow. On an INSERT, UPDATE, or DELETE it really writes, and on production that is a data change you did not intend. The safe pattern is to wrap the write in a transaction you roll back:
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
UPDATE accounts SET balance = balance - 10 WHERE id = 42;
ROLLBACK;
The plan runs, you get real measurements, and the ROLLBACK undoes the change. Forget the BEGIN ... ROLLBACK and you have run a real UPDATE in production to read a plan. This is a classic, avoidable incident.
Second, timing overhead. EXPLAIN ANALYZE calls the system clock twice per row to measure per-node time. On hardware with a slow clock source, that overhead can dwarf the query and skew the numbers. Check your clock with pg_test_timing; if a single timing call costs hundreds of nanoseconds, add TIMING OFF to drop the per-row clock reads while keeping ANALYZE’s row counts and totals:
EXPLAIN (ANALYZE, BUFFERS, TIMING OFF)
SELECT ...;
Third, finding which query to explain at all. You cannot explain a query you have not noticed. Two tools surface the offenders. The pg_stat_statements extension aggregates every executed statement with its call count and timing; order by total_exec_time to find the queries eating the most wall-clock across the system:
SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
The auto_explain module logs the plan of any statement slower than a threshold, automatically, with no one watching. Set auto_explain.log_min_duration to a millisecond cutoff and every slow query logs its full plan, so the next time a query crosses the line you already have its EXPLAIN ANALYZE waiting in the log:
# postgresql.conf
auto_explain.log_min_duration = '500ms'
auto_explain.log_analyze = on
auto_explain.log_buffers = on
Once you have a plan, pasting its text into a plan visualizer such as depesz or Dalibo gives you a colored, sortable view that highlights the heaviest node and does the loops multiplication for you. They are reading the same text you now can; they just make the worst node jump out.
9. A full worked example: diagnosing a slow plan
Put it all together on one plan. A report query is slow, you wrap it in nothing dangerous because it is a SELECT, and you run the incantation:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, SETTINGS)
SELECT c.name, sum(o.total)
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE c.region = 'EMEA' AND o.status = 'shipped'
GROUP BY c.name;
GroupAggregate (cost=0.43..512840.10 rows=200 width=40)
(actual time=12.4..14820.7 rows=14200 loops=1)
-> Sort (cost=0.43..511200.00 rows=200 width=12)
(actual time=12.3..9980.2 rows=2010000 loops=1)
Sort Key: c.name
Sort Method: external merge Disk: 412800kB
Buffers: shared hit=2048 read=180224, temp read=51600 written=51600
-> Nested Loop (cost=0.43..510800.00 rows=200 width=12)
(actual time=0.06..6120.4 rows=2010000 loops=1)
-> Seq Scan on customers c
(cost=0.00..3200.00 rows=80000 width=36)
(actual time=0.02..40.1 rows=80000 loops=1)
Filter: (region = 'EMEA')
-> Index Scan using orders_cust_idx on orders o
(cost=0.42..6.30 rows=1 width=12)
(actual time=0.02..0.06 rows=25 loops=80000)
Index Cond: (customer_id = c.id)
Filter: (status = 'shipped')
Rows Removed by Filter: 60
Read it bottom-up and the diagnosis writes itself. Start at the inner Index Scan: estimate rows=1, actual rows=25, and loops=80000. That is the estimate gap and the loops trap together. The planner believed each customer had one matching order, so it chose a Nested Loop; in reality each had 25, and the inner scan ran 80,000 times producing about 2 million rows. The Nested Loop total of 6120 ms is the cost of that misjudgment. The estimate-versus-actual gap (rows=1 versus 25 per loop, rows=200 versus 2,010,000 at the join) tells you the statistics on orders are wrong: ANALYZE is stale, or the status and customer_id selectivities were multiplied as if independent. That is the root cause, and the statistics-and-row-estimation page is where you fix it.
Now climb to the Sort. Sort Method: external merge Disk: 412800kB and temp read=51600 written=51600: this sort spilled about 400 MB to disk because 2 million rows did not fit in work_mem. That is the second wound, and it is downstream of the first: the planner sized the sort for 200 rows, got 2 million, and had nowhere to put them. The Buffers line shows read=180224 pages, about 1.4 GB pulled from disk, confirming this query is I/O-bound on top of everything.
So the fix has an order. First, refresh the statistics on orders (and consider extended statistics on the correlated customer_id and status) so the planner stops believing one order per customer; with honest estimates it will likely choose a Hash Join over the Nested Loop and never probe 80,000 times. Second, with the row count understood, the spill may resolve on its own, or a higher work_mem for this report keeps the sort in memory. You found the expensive node (the Nested Loop driven by the inner scan), the estimate gap (rows=1 versus 25, cascading to rows=200 versus 2,010,000), and the spill (external merge Disk: 412800kB), and you stated the fix: fix the statistics first, then the memory. That is the whole skill.
Mastery Questions
-
You run
EXPLAIN(withoutANALYZE) on a reporting query and seecost=0.00..847200.00. A teammate says “that means it takes 847 seconds, we have to fix it.” Are they right, and what should you actually do?Answer. They are wrong, and the mistake is reading an estimate as a measurement. The cost
847200.00is an abstract number in the planner’s own units, anchored toseq_page_cost = 1.0, as the query-planner-and-cost-model page explains; it is not milliseconds or seconds, and it is only the planner’s prediction, which may itself be wrong.EXPLAINwithoutANALYZEnever ran the query, so you have no measured time at all. What you should do is runEXPLAIN (ANALYZE, BUFFERS)to actually execute it and get realactual timenumbers, then compare estimatedrowsto actualrowson each node to see whether the high cost reflects a genuinely big query or a misestimate. The high cost might be entirely justified (the query really does scan a billion rows) or it might come from a row estimate that is wildly off, and onlyANALYZEdistinguishes those. The lesson: cost is a prediction in abstract units, not a runtime, and the only number a query is judged on is measured time. -
An inner node of a nested loop shows
actual time=0.040..0.060 rows=2 loops=120000and looks trivially fast, yet the query takes nine seconds. Where did the time go, and how do you read this node correctly?Answer. The time went into this exact node, and the per-loop reporting hid it. On the inner side of a nested loop,
actual timeandactual rowsare reported per loop, averaged across all executions, not as totals. This node ranloops=120000times. Its true time contribution is roughly0.060 ms * 120000 = 7200 ms, about seven seconds, which is most of the nine-second runtime, and it truly emitted2 * 120000 = 240000rows, not 2. Reading the per-loop0.060 msas a total is the loops trap, and it makes the single most expensive node in the plan look like the cheapest. The correct reading is always to multiply the per-loop time and rows byloops. The deeper fix is usually upstream: a node probed 120,000 times almost always means the planner underestimated the outer side’s cardinality and chose a Nested Loop where a Hash Join would have been far better, so you trace the estimate gap and fix the statistics that caused it. -
A query’s plan shows a
Sortnode withSort Method: external merge Disk: 380000kBand aBuffersline readingtemp read=47500 written=47500. What is happening physically, what is the immediate cause, and what are your options?Answer. Physically, the sort did not fit in memory and spilled to temporary files on disk.
Sort Method: external mergeis PostgreSQL’s name for a disk-based merge sort, the slow path it falls back to when an in-memory quicksort would exceed its memory budget. TheDisk: 380000kBconfirms about 380 MB went to temp files, and thetemp read=47500 written=47500buffer counts (each buffer is 8 KB, so about 370 MB written and read back) are the I/O of that spill. The immediate cause is that the number of rows being sorted, times their width, exceededwork_memfor this operation. Your options, in order: first check whether the row count is even correct by comparing estimatedrowsto actualrowson the nodes feeding the sort, because a spill is often the downstream symptom of a misestimate that produced far more rows than expected, in which case fixing the statistics shrinks the sort. If the row count is genuinely large, raisework_memfor this query or session so the sort stays in memory (an in-memory quicksort is dramatically faster than an external merge), or reduce the rows reaching the sort with a better index or an earlier filter. The point to internalize:external mergeand temp blocks are always a spill, and a spill is always either too little memory or too many rows, and you check the row estimate before you reach for more memory.
Sources & evidence15 claims · 3 cited
EXPLAIN option semantics, the cost/rows/width and actual-time/rows/loops output format, per-loop reporting on the nested-loop inner side, BUFFERS categories, and the per-node lines (Sort Method, Hash Batches, Rows Removed by Filter, Heap Fetches, Recheck Cond, Workers Launched, Memoize) are grounded in src_pg_docs_explain. pg_stat_statements ordering is grounded in src_pg_docs_pgstatstatements. The estimate-gap-to-statistics causal link is grounded in src_pg_docs_planner_stats. Default cost constants, the PG16 BUFFERS-with-ANALYZE default, auto_explain settings, pg_test_timing, and the worked-example arithmetic are stable common knowledge of PostgreSQL 16/17 behavior not pinned to a single cited paragraph in the allowed sources.
- EXPLAIN displays the planner-chosen plan and its estimated cost, rows, and width without executing the statement, while EXPLAIN ANALYZE actually runs the statement and adds measured actual time, actual rows, and loops to each node.verified
- EXPLAIN accepts the options ANALYZE, BUFFERS, VERBOSE, COSTS, TIMING, SETTINGS, WAL, and FORMAT (text, json, yaml, xml), with COSTS and TIMING on by default and TIMING reportable per node only when ANALYZE is in effect.verified
- In PostgreSQL 16 and later, the BUFFERS option defaults to on whenever ANALYZE is used, so buffer counts appear without explicitly requesting them; on earlier versions BUFFERS must be requested explicitly.stable common knowledge
- A plan executes inside-out and bottom-up: the most-indented child node runs first and feeds its parent, and the topmost node finishes last and produces the final result set.verified
- Each plan node prints cost=startup..total, estimated rows, and estimated width in bytes, and with ANALYZE adds actual time=startup..total in milliseconds, actual rows, and a loops count of how many times the node executed.verified
- The primary diagnostic is comparing a node's estimated rows to its actual rows; a divergence of orders of magnitude (for example an estimate of 10 rows against an actual 98214) means the plan rests on bad statistics rather than a flawed plan choice.verified
- A large estimated-versus-actual row gap traces to the per-column statistics behind the estimate: either ANALYZE has not run since the data changed, or the estimate is correlation-blind because the planner multiplied selectivities assuming independent columns, fixable with fresher or extended statistics.verified
- On the inner side of a nested loop the node's reported actual time and actual rows are per-loop averages, not totals, so the true contribution is the per-loop value multiplied by loops; a node showing actual time 0.088 with loops 98214 really cost about 8642 ms.verified
- The BUFFERS Buffers line reports shared hit (page found in cache), shared read (page fetched from disk), shared dirtied, and shared written, plus temp read and temp written for temporary-file pages; each buffer is an 8 KB page, so a count of 262144 buffers is about 2 GB touched.verified
- Nonzero temp read and temp written buffers on a Sort or Hash node indicate an on-disk spill, meaning the operation exceeded work_mem and wrote temporary files rather than completing in memory.verified
- A Sort node reports Sort Method quicksort with a Memory figure when it fit in work_mem, or Sort Method external merge with a Disk figure when it spilled; a Hash node with Batches greater than 1 has spilled its build side to disk, and a Bitmap Heap Scan reports a Recheck Cond with lossy heap blocks when the bitmap degraded to page granularity under memory pressure.verified
- Rows Removed by Filter counts rows read and discarded by a node's filter, Heap Fetches on an Index Only Scan counts rows that still needed a heap visit because the visibility map did not confirm them all-visible, and a Gather node reports Workers Planned versus Workers Launched where fewer launched than planned means less parallelism than the plan was costed for.verified
- Because EXPLAIN ANALYZE executes the statement, running it on an INSERT, UPDATE, or DELETE performs the real write, so such statements should be wrapped in BEGIN ... ROLLBACK to measure the plan without persisting the change.verified
- EXPLAIN ANALYZE reads the system clock twice per row for per-node timing, so on hardware with a slow clock source (checkable with pg_test_timing) the overhead can dominate; setting TIMING OFF drops the per-row clock reads while retaining ANALYZE's row counts and totals.stable common knowledge
- The pg_stat_statements extension aggregates executed statements with call counts and timing, and ordering its rows by total_exec_time descending surfaces the queries consuming the most cumulative execution time across the system.verified
Cited sources
- PostgreSQL Documentation: Using EXPLAIN · PostgreSQL Global Development Group
- PostgreSQL Documentation: How the Planner Uses Statistics · PostgreSQL Global Development Group
- PostgreSQL Documentation: pg_stat_statements · PostgreSQL Global Development Group