Index Types Beyond B-Tree

Choosing the right PostgreSQL index type when the B-tree falls short: GIN, GiST, SP-GiST, BRIN, Hash, and pgvector, the operator classes behind them, and the failure modes engineers actually hit.

Learning outcomes

The b-tree-index-internals page taught you the index you reach for nine times out of ten. This page is about the tenth time. Once a query asks “does this jsonb contain that key?”, “does this polygon overlap that box?”, or “give me the ten vectors nearest this one”, a B-tree has nothing to offer. PostgreSQL ships five other access methods plus a vector extension, each tuned for a kind of question a B-tree cannot answer. Picking the wrong one wastes terabytes; picking the right one turns a thirty-second sequential scan into a millisecond probe.

After studying this page, you can:

  • Explain in mechanism terms why a B-tree fails for containment, set membership, geometry, full-text, and append-only ranges.
  • Match a workload to GIN, GiST, SP-GiST, BRIN, Hash, or pgvector and defend the choice with the operator set it supports.
  • Choose between operator classes (jsonb_path_ops vs jsonb_ops, brin_minmax_ops vs brin_bloom_ops, gist_trgm_ops vs gin_trgm_ops) and predict the trade-off.
  • Recognise the classic failure modes: BRIN on an unordered column, GIN with fastupdate on a write-heavy table, a giant GIN built with default maintenance_work_mem.
  • Read pg_stat_user_indexes, pg_index.indisvalid, and pgstattuple to confirm the index you built is the one queries are actually using.

Before we dive in

You should already know what a B-tree index does (the b-tree-index-internals page): it stores keys in sorted order, walks the tree for equality and range, and supports ORDER BY for free. You should also be comfortable reading an EXPLAIN ANALYZE plan and recognising an Index Scan, an Index Only Scan, and a Bitmap Index Scan (the reading-explain-analyze and scan-methods-and-when-each-wins pages cover those).

A few terms come up repeatedly. A component key is one indexable piece extracted from a composite value: every word in a tsvector, every key path in a jsonb, every element in an array. An operator class is the rulebook that tells an index type how to compare values for a given data type: it lists the operators the index supports (=, @>, &&, and so on) and the strategy numbers behind them. An access method is the index type itself (btree, gin, gist, spgist, brin, hash); every index in PostgreSQL is built by exactly one. Hold those three. Everything below is about choosing the right access method, then the right operator class on top of it.

Mental Model

The wrong model, and the one that costs the most engineer-days, is “I just need an index on this column, the type does not really matter.” Under that model a CREATE INDEX ON events (payload) on a jsonb column should make WHERE payload @> '{"plan":"pro"}' fast, because there is an index now. It does not, and the planner ignores it, because a B-tree on a jsonb can answer “is this whole value equal to that whole value”, not “does this value contain that key-value pair.”

The better model is to think of each index as a specialist that supports a specific list of operators. A B-tree supports =, <, <=, >, >=, and BETWEEN on linearly ordered scalars: nothing more. GIN supports @>, ?, ?|, ?&, @@, and the trigram variants of LIKE. GiST supports && (overlap), <@ (containment), <-> (distance for KNN), and friends. The planner only uses an index whose operator class lists the operator in your WHERE clause. So picking an index type is really picking the set of operators you want fast, and the data type is downstream of that.

Keep this picture. “Which index type” is not a property of the column; it is a property of the queries you want to answer cheaply. Two columns of the same jsonb type, one filtered with @> and one filtered with =, want different indexes (or one column wants no index at all).

Breaking it down

1. Why one index type is never enough

Start with what a B-tree actually does, because that is what every other type is reacting to. A B-tree puts whole keys in sorted order on its pages, then walks the tree from root to leaf. The sort order is total: any two keys are comparable with <. This works beautifully for scalars like bigint, timestamptz, and text because they have a natural linear order, and the same structure handles =, <, <=, >, >=, BETWEEN, and ORDER BY without extra work.

Now imagine the column is a jsonb. There is a jsonb-to-jsonb comparison (so a B-tree builds and = works), but WHERE doc @> '{"plan":"pro"}' is asking a containment question that has nothing to do with the sort order of whole documents. The B-tree cannot help: it cannot tell you “which whole documents contain this fragment” by walking a sorted list of whole documents. Same story for tsvector @@ tsquery (does the document match this search), int[] && int[] (do these two sets share any element), box && box (do these rectangles overlap), or “give me ten rows nearest this point in 2D”.

Each of these questions has a structure a different index type can exploit. Containment and set membership want one index entry per element, not per row, so a single query for “contains plan=pro” can scan just the postings list for that one element. Overlap and distance want a tree that splits space, not a totally ordered list. Append-only time series want a tiny summary index that records min and max per page range instead of one entry per row. Equality on a high-cardinality column wants an O(1) hash probe. PostgreSQL ships an access method for each.

flowchart LR
    A[Query shape] --> B{What operator?}
    B -->|"= < <= > >= BETWEEN ORDER BY"| C[B-tree]
    B -->|"@> ? ?| ?& @@ &&"| D[GIN]
    B -->|"&& <@ <-> overlap KNN"| E[GiST]
    B -->|"network prefix quad-tree radix"| F[SP-GiST]
    B -->|"range min/max on tidy column"| G[BRIN]
    B -->|"= only, very large index"| H[Hash]

The next six rungs are that chart, one type per rung, with the mechanism behind each.

2. GIN: an inverted index for things that contain things

GIN stands for Generalized Inverted Index, and “inverted” is the whole story. A B-tree indexes each whole value once. GIN does the opposite: it cracks each value into its component keys and stores one entry per component key, with a sorted list of row ids (“posting list”) under each. So a tsvector with twenty words produces twenty index entries, all pointing back at the row. A jsonb with five top-level keys produces five entries. An int[] with a hundred elements produces a hundred entries.

That inversion is what makes containment fast. WHERE doc @> '{"plan":"pro"}' becomes “look up the entry for the key path plan=pro, walk its posting list, return those rows.” The work scales with how many rows contain the searched key, not with how many rows exist. The same shape powers full-text search (tsvector @@ tsquery), jsonb existence (doc ? 'plan'), array containment and overlap (tags @> ARRAY['vip'], tags && ARRAY['vip','beta']), and trigram search through pg_trgm (name LIKE '%smith%' becomes a trigram lookup).

CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Full-text on a tsvector column
CREATE INDEX events_ts_idx ON events USING gin (search_tsv);

-- jsonb containment, the bigger and more general operator class
CREATE INDEX events_payload_gin ON events USING gin (payload);

-- jsonb_path_ops: smaller, faster, but ONLY supports @>
CREATE INDEX events_payload_pathops
  ON events USING gin (payload jsonb_path_ops);

-- Trigram search for substring LIKE
CREATE INDEX users_name_trgm ON users USING gin (name gin_trgm_ops);

The cost is real and worth respecting. GIN entries are larger than B-tree entries because there is one per component key, not one per row, so a GIN index on a wide jsonb column can easily be larger than the table. Writes are slower because every key in the new value has to find its place in a sorted posting list. To soften that, GIN keeps a pending list: new inserts go to an unsorted buffer first and are merged into the main structure later. The parameter fastupdate (default on) controls this. The hidden cost: a query that touches the pending list has to scan it linearly before walking the main index, so on a write-heavy table the pending list can grow and tail-latency of reads spikes until the next gin_clean_pending_list() or autovacuum flush.

The operator class is the second dial. jsonb_ops is the default for jsonb and supports @>, ?, ?|, ?&. jsonb_path_ops supports only @>, but it stores a single hash per key path instead of one entry per key and value, so it is typically 30 to 70 percent smaller and faster for containment-only workloads. For text similarity, gin_trgm_ops indexes every three-character substring, which is exactly what makes LIKE '%foo%' fast (a B-tree can only help with LIKE 'foo%', the left-anchored case).

Two operational notes that bite the unwary. First, building a big GIN with the default maintenance_work_mem (64 MB) is brutally slow because the build constantly spills sorted runs to disk; raise it to several GB on a build-only session before CREATE INDEX. Second, fastupdate is the right default for batch ingest with infrequent reads, but on a hot OLTP table it can be the difference between a 2 ms and a 200 ms p99: ALTER INDEX ... SET (fastupdate = off) or set a small gin_pending_list_limit so the merge happens often.

GIN operator classes side by side
Indexes every key path AND every value. Supports @>, ?, ?|, ?&. Larger, slower to build, but the only class that answers existence (?) and the multi-key variants. Use it when your queries ask both containment and existence.

3. GiST: a balanced tree for shapes, ranges, and distances

GiST stands for Generalized Search Tree, and the “generalized” is doing a lot of work. It is a balanced tree (like a B-tree) but parameterised: each operator class supplies the rules for what an internal-node key summarises, how to test “could a match be down this subtree”, and how to split a node when it overflows. The result is one tree structure that can index geometric shapes, ranges, IP addresses, full-text vectors (approximately), and anything else where the meaningful question is overlap, containment, or distance rather than “less than.”

The motivating examples explain the shape. For geometry, each internal node stores the bounding box of every leaf below it; WHERE region && search_box walks the tree, descending only into subtrees whose bounding box overlaps search_box. For range types (int4range, tstzrange), each internal node stores the union range of its children; this is what makes EXCLUDE USING GIST (room WITH =, during WITH &&) work as a no-overlap constraint. For IP prefixes through ip4r, the internal nodes store covering prefixes. For 2D points, each internal node stores a box; ORDER BY point <-> '(10,20)'::point LIMIT 10 walks the tree in distance order, pulling the ten nearest neighbours without ever scanning the table. This last trick, KNN (k-nearest-neighbour), is GiST’s signature and is genuinely unique to it among the built-in index types.

CREATE EXTENSION IF NOT EXISTS btree_gist;

-- A no-overlap constraint on reservations: same room, no overlapping intervals.
CREATE TABLE bookings (
  room int NOT NULL,
  during tstzrange NOT NULL,
  EXCLUDE USING gist (room WITH =, during WITH &&)
);

-- 2D points with KNN: ten nearest stores to a given location
CREATE INDEX stores_loc_gist ON stores USING gist (location);
SELECT id, name FROM stores
ORDER BY location <-> point '(12.97,77.59)'
LIMIT 10;

GiST is “lossy” by design. Internal nodes summarise; a positive at an internal node means “a match might be down here,” not “a match is here.” Every leaf hit needs a recheck of the original predicate against the actual tuple. That is why a GiST plan typically shows a Bitmap Index Scan feeding a Bitmap Heap Scan with a Recheck Cond, and why GiST is a fine choice for approximate full-text but loses to GIN on exact text search.

Two operator classes worth knowing. gist_trgm_ops from pg_trgm indexes trigrams in a GiST tree; GIN beats it on read latency but GiST writes faster and supports KNN on trigram similarity (ORDER BY name <-> 'smithh' LIMIT 5). The btree_gist extension adds B-tree-style equality to GiST operator classes so you can put scalar columns alongside ranges in one index, which is exactly what the EXCLUDE constraint above needs (the room = room part is not a native GiST operator and only works after CREATE EXTENSION btree_gist).

4. SP-GiST: when the data partitions itself

SP-GiST (Space-Partitioned GiST) is the same generalisation idea as GiST but with an unbalanced tree. Each internal node partitions its space into disjoint regions; the children inherit those regions and partition them further. The classic shapes are quadtrees (split a 2D plane into four quadrants at each level), kd-trees (alternate the splitting axis), and radix tries (split text by next character). Where GiST allows children to overlap and a search may descend multiple branches, SP-GiST guarantees disjoint partitions, so a search follows exactly one path down the tree.

That difference matters when the natural structure of the data is already partitioned. Network prefixes are radix-tree-shaped: 192.168.0.0/16 strictly contains 192.168.1.0/24. IP-address columns of type inet and cidr index well in SP-GiST because a prefix query (WHERE addr <<= inet '192.168.0.0/16') follows the trie down to exactly the subtree of matching prefixes. 2D points cluster well into quadtrees because most real point distributions are uneven. Text prefix matching maps to a radix trie because two strings sharing a prefix share an ancestor.

-- IP-prefix lookups: SP-GiST radix trie is the right shape
CREATE INDEX rules_addr_spgist ON firewall_rules USING spgist (addr inet_ops);

-- Quadtree on 2D points
CREATE INDEX pois_loc_spgist ON pois USING spgist (location);

-- Text prefix: spgist with text_ops handles LIKE 'smith%' and ^@ very well
CREATE INDEX users_name_spgist ON users USING spgist (name);

When does SP-GiST win over GiST or B-tree? When the data has natural disjoint partitions and the typical query asks about one region: IP prefixes, geographic regions where points cluster unevenly, hierarchical paths. When does it lose? When the data is uniform (the tree becomes unbalanced and pathological) or the queries are not partitioning-shaped (a B-tree on text still wins for = and left-anchored LIKE).

5. BRIN: a tiny index that only works on tidy data

BRIN (Block Range Index) is the most extreme trade-off in the catalogue. Instead of one entry per row, it stores one entry per range of heap pages (default 128 pages, about 1 MB of heap). Each entry records the min and max value of the indexed column across that range. To answer WHERE ts BETWEEN $1 AND $2, BRIN walks its tiny summary, identifies which page ranges could contain matching rows (whose min..max overlaps the search range), and hands those ranges to a Bitmap Heap Scan that rechecks every row.

The size is the point. A B-tree on a 200 GB time-series table runs maybe 5 to 15 GB. A BRIN on the same table is on the order of 1 to 10 MB. It fits comfortably in shared_buffers and the planner can scan all of it in microseconds. The catch is in the word “could.” If the column is strongly correlated with physical storage order, each page range has a tight min..max and BRIN eliminates almost all pages from the scan. If the column is uncorrelated, every page range has min near the global min and max near the global max, so every range “could” match, BRIN returns the whole table, and you have built a useless index that the planner will never use.

-- A 200 GB time-series table with an append-only ts column
CREATE INDEX events_ts_brin
  ON events USING brin (ts) WITH (pages_per_range = 64);

-- Less-correlated data: multi-minmax (PG14+) keeps several min/max pairs per range
CREATE INDEX events_user_brin
  ON events USING brin (user_id int8_minmax_multi_ops)
  WITH (pages_per_range = 32, values_per_range = 32);

Two parameters tune BRIN. pages_per_range (default 128) trades size for selectivity: smaller ranges mean a bigger index but tighter min..max bounds, so more pages can be eliminated. Cut it to 32 or 16 on tables where each range covers very different values; leave it at 128 or raise it for very strictly sorted columns. The brin_minmax_multi_ops and brin_bloom_ops operator classes (PostgreSQL 14+) extend BRIN to less-correlated data: minmax-multi keeps multiple disjoint min..max pairs per range (so a range with two clusters is summarised more tightly), and brin_bloom_ops stores a Bloom filter per range to answer equality on a high-cardinality column where exact min..max is not selective.

The classic failure mode is BRIN on a column that looks ordered but is not. A created_at column on an append-only log table is gold. The same column on a table that gets shuffled by row-level updates, or backfilled out of order, is worse than useless: the index exists, the planner ignores it, and you have paid a write tax on every insert and update for nothing. pg_stats.correlation (a number from -1 to 1) tells you whether a column is correlated with physical order; values close to 1 or -1 say yes, values near 0 say BRIN will not help.

BRIN vs B-tree, size as the table grows
Heap size200 GB
1 GB1024 GB
Past 100 GB: BRIN is the only index that fits in shared_buffers, but only if pg_stats.correlation is near 1 or -1

The picture to keep: a B-tree on this column would be a few percent of the heap; a BRIN with default pages_per_range is roughly 1 MB per 200 GB of heap, give or take. Move the slider to feel the gap. The savings only materialise when the column is physically sorted, which is why BRIN’s home is timestamps, bigserial ids, and partitioned-by-time fact tables.

6. Hash: equality only, and almost never the right answer

PostgreSQL has had a hash access method since the beginning, but until PostgreSQL 10 it was not WAL-logged, which meant it did not survive a crash and replicas could not see it. That defect quietly killed it as a serious option for two decades. From PostgreSQL 10 onward hash indexes are fully WAL-logged, crash-safe, and replicated, so they are real options again.

The mechanism is O(1) average-case lookup for equality: hash the key, jump to the bucket, scan a short overflow chain. That is faster than walking log2(N) levels of a B-tree only when N is very large and the saved levels are not already cached. The hash bucket pages are also unordered, so a hash index cannot answer <, >, BETWEEN, or ORDER BY: it is equality-only, no exceptions.

-- Equality-only lookups on a very large, randomly distributed key
CREATE INDEX big_table_uid_hash ON big_table USING hash (uid);

Where does it win in practice? Almost nowhere. A B-tree on the same column is usually 10 to 30 percent smaller (a hash bucket page wastes space at low fill), supports the same equality lookup, and also supports ranges and ORDER BY. The narrow win is a very large table where the key is fully random (so a B-tree gives no range locality benefit) and the index size matters, and even then the difference is often a few percent. The realistic recommendation is to default to B-tree for equality unless you have measured a real gain.

The mistake to avoid is the classic “I only do equality so I want hash.” The cost is that the next time someone writes WHERE uid > $1 the planner cannot use the index and falls back to a sequential scan, and you find out at 3am.

7. Operator classes: the dial that picks what an index can answer

Every index in PostgreSQL is the pairing of an access method (b-tree, gin, gist, spgist, brin, hash) and an operator class on each indexed column. The operator class declares which operators the index supports for that data type. When the planner considers an index, it looks up the operator class and asks: “does this WHERE clause use an operator the class supports?” If not, the index is ignored, regardless of how perfectly it matches the column.

That single rule explains a long list of “my index is not being used” outages. A CREATE INDEX ON events (payload) on a jsonb column uses the default jsonb_ops operator class for B-tree, which only supports = on the whole jsonb. The WHERE payload @> '{"plan":"pro"}' query needs @>, which is in the jsonb_ops GIN operator class, not the B-tree one. Same column, same data, different index type, different operator class, different planner outcome.

You can list which operator classes exist for any access method:

SELECT amname, opcname, opcintype::regtype AS for_type, opcdefault
FROM pg_am a JOIN pg_opclass o ON o.opcmethod = a.oid
WHERE amname IN ('btree', 'gin', 'gist', 'spgist', 'brin', 'hash')
ORDER BY amname, for_type::text, opcname;

The choices that matter most often, summarised in one table:

Access methodDefault opclass for typeAlternative worth knowing
GIN on jsonbjsonb_ops (supports @>, ?, `?, ?&`)
GIN on text (with pg_trgm)none by defaultgin_trgm_ops (LIKE %...%, similarity %)
GiST on text (with pg_trgm)none by defaultgist_trgm_ops (slower reads than GIN, faster writes, supports KNN <->)
BRIN on scalars*_minmax_ops per type*_minmax_multi_ops, *_bloom_ops (PG14+)
SP-GiST on inetinet_ops(none)
GiST on tstzrangerange_opscombine with btree_gist for EXCLUDE

The rule of thumb to internalise: when you pick an index, pick the operator class on purpose. The default is rarely wrong, but knowing the alternative is what separates “the index is built” from “the index does what I asked for at the size I can afford.”

8. pgvector: approximate nearest neighbour for embeddings

AI workloads put a new question on the table: “give me the K rows whose embedding vector is closest to this one.” The native answer is the pgvector extension, which adds a vector(N) type and two purpose-built index types: IVFFlat and HNSW. Neither is a built-in access method (pgvector registers them through the extension), but they are the right answer when the column is a high-dimensional embedding.

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE docs (id bigserial PRIMARY KEY, embedding vector(1536));

-- IVFFlat: cluster vectors into LISTS at build time, probe N at query time
CREATE INDEX docs_ivfflat ON docs
  USING ivfflat (embedding vector_cosine_ops) WITH (lists = 1000);
SET ivfflat.probes = 10;

-- HNSW: a navigable small-world graph; per-query effort tuned with ef_search
CREATE INDEX docs_hnsw ON docs
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);
SET hnsw.ef_search = 40;

IVFFlat partitions the vector space into lists clusters at build time using k-means; a query probes the probes clusters nearest to the query vector and scans their members. More probes mean better recall and more work. It is cheap to build but recall degrades on very high dimensions and on data that does not cluster cleanly. HNSW builds a multi-layer graph where each node has m neighbours and search descends from a top-layer entry point, widening the search frontier to ef_search candidates at query time. It costs much more to build and is larger on disk, but it gives consistently higher recall at low query latency, which is what most production AI workloads need.

The rule of thumb in 2026: start with HNSW if build time and disk size are acceptable; fall back to IVFFlat if you have hundreds of millions of vectors and need to fit the index on disk. Both indexes are approximate: they trade some recall for orders-of-magnitude latency improvement over a sequential scan that computes cosine distance row by row. Always measure recall on a held-out query set before promoting an ANN index to production.

9. A decision tree, and the failure modes that come with it

Here is the whole catalogue in one workflow. Read it from the top, stop at the first match.

flowchart TB
    A[What is your WHERE / ORDER BY?] --> B{Equality and range on a scalar?}
    B -->|yes| C[B-tree]
    B -->|no| D{Containment, existence, full-text, trigram?}
    D -->|yes| E[GIN]
    D -->|no| F{Geometry, range overlap, KNN distance?}
    F -->|yes| G[GiST]
    F -->|no| H{IP prefix, quadtree, text radix?}
    H -->|yes| I[SP-GiST]
    H -->|no| J{Append-only, correlated, very large?}
    J -->|yes| K[BRIN]
    J -->|no| L{Vector ANN?}
    L -->|yes| M[pgvector HNSW or IVFFlat]
    L -->|no| N[No index helps; rethink the query]

Now the matched set of failure modes, one per type, that you will see in real systems:

  • BRIN on an unordered column. The index builds, the planner refuses to use it, you pay write cost for no read benefit. Check pg_stats.correlation before creating BRIN; aim for abs(correlation) > 0.9. If correlation is poor, consider brin_minmax_multi_ops (PG14+) instead, or accept a B-tree.
  • GIN with fastupdate = on on a write-heavy table. Writes look fine, p99 reads jump as the pending list grows, then snap back after a flush. Symptom: read latency that oscillates on a multi-minute period. Fix: ALTER INDEX ... SET (fastupdate = off) for low-latency reads, or shrink gin_pending_list_limit to flush more often.
  • A giant GIN built with default maintenance_work_mem. The build runs for hours and the table is locked the whole time. Set maintenance_work_mem to several GB on the build session and use CREATE INDEX CONCURRENTLY to avoid the table lock.
  • A Hash index when ranges are coming. The day someone writes WHERE id > $1, the planner falls back to a Seq Scan. Default to B-tree for equality unless you have measured a real gain.
  • jsonb_ops GIN where jsonb_path_ops would do. The index is twice the size it needs to be, writes are slower, and the only queries you actually run are @>. Switch.
  • A GiST index used for exact text search. GiST is lossy and rechecks every leaf hit. For exact full-text, GIN is faster on reads.
  • pgvector ANN without recall measurement. You ship a fast nearest-neighbour search that quietly misses 30 percent of true neighbours. Always test recall against an exhaustive baseline on a sample.
The deeper traps, one click at a time

10. Diagnostics: proving the right index is doing the work

Building the right index is only half the job. You also need to confirm the planner is using it, that it is not bloated, and that it is not invalid. Three views and two extensions cover almost every question.

The first stop is pg_stat_user_indexes, which counts how often each index has been scanned, how many tuples it returned, and how many of those were live. An index with idx_scan = 0 after a week of production traffic is dead weight: nothing is using it and you are paying write cost.

SELECT schemaname, relname AS table, indexrelname AS index,
       idx_scan, idx_tup_read, idx_tup_fetch,
       pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC, pg_relation_size(indexrelid) DESC
LIMIT 20;

The second is pg_index.indisvalid, which tells you whether a CREATE INDEX CONCURRENTLY actually finished. A failed concurrent build leaves the row with indisvalid = false; the planner refuses to use the index, but every insert and update still updates it. Find these and drop them.

SELECT indexrelid::regclass AS index, indrelid::regclass AS table
FROM pg_index WHERE NOT indisvalid;

The third is pgstattuple (and its sibling pgstatindex), an extension that reports index size, free space, and dead-tuple ratio per index page. A B-tree with 40 percent free space is showing bloat; rebuild with REINDEX CONCURRENTLY. For GIN specifically, pgstatginindex reports the pending-list size, which is exactly the number you want to see to diagnose the read-latency oscillation from rung 2.

CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT * FROM pgstatindex('events_ts_brin');
SELECT * FROM pgstatginindex('events_payload_gin');

Round it out with EXPLAIN (ANALYZE, BUFFERS) on the real query. The plan tells you which index node type is used (Index Scan, Bitmap Index Scan, Index Only Scan), which operator is being matched (Index Cond: (payload @> ...)), and how many heap pages had to be visited. If you built a GIN for @> and the plan still shows Seq Scan, the operator class is wrong, the statistics are stale, or the planner thinks the query is selective enough that a Seq Scan wins.

Check yourself
You built CREATE INDEX events_ts_brin ON events USING brin (ts), but EXPLAIN ANALYZE on WHERE ts > now() - interval '1 hour' still shows a Seq Scan over 200 GB. What is the most likely root cause?

The diagnostic loop is the same for every type. Build the index. Run the workload. Read pg_stat_user_indexes to confirm scans. Read EXPLAIN ANALYZE to confirm the plan uses the operator and the size of the heap visits. Read pgstatindex or pgstatginindex to confirm the index has not bloated. If any of those three signals look wrong, the index is not earning its keep.

Mastery Questions

  1. A product team adds an events table with a payload jsonb column. Their hot query is WHERE payload @> '{"plan":"pro","status":"active"}'. They created CREATE INDEX ON events (payload) and are confused that EXPLAIN ANALYZE still shows a Seq Scan. What is the diagnosis and what would you do?

    Answer. The default CREATE INDEX uses a B-tree, which on jsonb uses the jsonb_ops B-tree operator class. That class only supports = on the whole document, not @>. The planner sees the index does not support the operator in the WHERE clause and ignores it, falling back to Seq Scan. The fix is a GIN index: CREATE INDEX events_payload_gin ON events USING gin (payload) for the default jsonb_ops class (which also supports ?, ?|, ?&), or CREATE INDEX ... USING gin (payload jsonb_path_ops) for a smaller and faster index when every query is @>. Verify with EXPLAIN ANALYZE that the plan now shows Bitmap Index Scan on events_payload_gin with an Index Cond: (payload @> '...'), and check pg_stat_user_indexes.idx_scan is climbing under load. The deeper lesson: an index type is a choice of which operators you want fast, not a property of the column type.

  2. You inherit a 400 GB metrics table that grows by 5 GB a day, sorted by (measured_at) (every insert is now()). Disk is filling and reports are slow. The current indexes are (measured_at) and (device_id, measured_at) as B-trees, totalling 60 GB. How would you re-architect the indexing, and why?

    Answer. The B-tree on (measured_at) alone is a strong candidate to replace with a BRIN. The column is perfectly correlated with physical order (every insert is at now(), so pg_stats.correlation will be near 1), the table is huge, and the typical range query asks “rows in the last hour / day / week.” A BRIN with pages_per_range = 64 or 32 will be a few MB at most, fits in shared_buffers trivially, and will eliminate almost all pages outside the range with one tiny scan. Keep the (device_id, measured_at) B-tree if reports filter by device, because BRIN cannot answer “rows for this device” (the device_id is not correlated with storage order). Pair this with partitioning by month so old months can be dropped or moved to cheaper storage; the table-partitioning concept covers that. Validate the change by reading EXPLAIN (ANALYZE, BUFFERS) before and after on representative queries, watching the heap buffer hits drop and the index size shrink from tens of GB to single-digit MB. The deeper lesson: when a column is strictly correlated with storage, BRIN is two to three orders of magnitude smaller than B-tree at the same selectivity, and on tables this size that difference is the difference between fitting indexes in RAM and not.

  3. A search feature lets users type any substring of a product name and expects sub-100 ms results: WHERE name ILIKE '%' || $1 || '%' over a 50-million-row products table. The current B-tree on lower(name) does nothing for unanchored matches. Walk through what you would build and why, including the trade-offs you would explain to the team.

    Answer. Unanchored LIKE '%...%' is a containment problem on character n-grams, not a sorted-prefix problem, so a B-tree is structurally the wrong shape. The right answer is the pg_trgm extension with a GIN index using gin_trgm_ops: CREATE EXTENSION pg_trgm; CREATE INDEX products_name_trgm ON products USING gin (name gin_trgm_ops). The extension indexes every three-character substring (trigram), and the GIN inverted lists let an unanchored ILIKE query look up the trigrams of the search term and intersect their posting lists. For 50 million rows the index will be sizeable (often 10 to 30 GB depending on average name length), so build it with maintenance_work_mem set to several GB and use CREATE INDEX CONCURRENTLY to avoid locking the table. The trade-offs to explain: writes are slower because every name change rewrites trigram entries; the index is much larger than a B-tree on the same column; if the team wants to add fuzzy matching, the same index supports the similarity operator % and ORDER BY name <-> 'query' for ranked nearest-trigram matches; if write cost is unacceptable, a GiST trigram index (gist_trgm_ops) writes faster but reads slower, which can be the right pick for write-heavy catalogues. Verify by running the real query under EXPLAIN (ANALYZE, BUFFERS) and confirming a Bitmap Index Scan on products_name_trgm with an Index Cond: (name ~~* '%...%'). The deeper lesson: every “make this LIKE query fast” problem is really “what is the structure of the matching question” plus “which operator class indexes that structure.”

Recommended next

  • Advanced Indexing Techniques
    Builds directly on this page: Advanced Indexing Techniques is the next step in the PostgreSQL performance ladder.
Sources & evidence14 claims · 5 cited

Mechanism and operator-set claims grounded in the PostgreSQL indexes/GIN/GiST/BRIN/btree documentation. Facts about pg_trgm, pgvector (IVFFlat/HNSW), SP-GiST radix structure, and GiST KNN with the <-> operator are correct and widely known but not covered by the listed source docs, so they are tagged stable-common-knowledge with empty source_ids.

  • Every index in PostgreSQL is built by exactly one access method (btree, gin, gist, spgist, brin, hash) paired with an operator class per indexed column, and the planner uses an index only when its operator class lists the operator that appears in the WHERE or ORDER BY clause.verified
  • GIN (Generalized Inverted Index) stores one entry per component key extracted from each value, with a posting list of row identifiers under each key, so it answers containment, existence, and overlap operators (@>, ?, ?|, ?&, @@, &&) without scanning the table.verified
  • GIN supports a pending list controlled by the fastupdate storage parameter (default on): inserts append to an unsorted buffer that is merged later, so a query touching the pending list scans it linearly before walking the main index, causing read tail-latency spikes on write-heavy tables until gin_clean_pending_list() or autovacuum flushes it.verified
  • For jsonb the jsonb_path_ops operator class supports only @> and stores a single hash per key path, producing an index typically 30 to 70 percent smaller and faster for containment-only workloads than the default jsonb_ops class, which additionally supports ?, ?|, and ?&.verified
  • The pg_trgm extension's gin_trgm_ops indexes every three-character substring of a text column, making unanchored LIKE '%...%' and ILIKE planner-friendly and supporting the similarity (%) and distance (<->) operators that a B-tree cannot answer.stable common knowledge
  • GiST is a balanced tree whose internal nodes hold operator-class-supplied summaries (bounding boxes for geometry, union ranges for range types, covering keys for trigrams), so it answers overlap (&&), containment (<@), and k-nearest-neighbour distance (<->) operators that have no total order.verified
  • EXCLUDE USING GIST is the canonical way to declare a no-overlap exclusion constraint on a range type, for example EXCLUDE USING gist (room WITH =, during WITH &&), and the equality part requires the btree_gist extension because = is not a native GiST operator.verified
  • GiST KNN search using ORDER BY column <-> value LIMIT k walks the tree in distance order and returns the k nearest rows without ever scanning the whole table; this distance-ordered traversal is unique to GiST among the built-in PostgreSQL access methods.stable common knowledge
  • SP-GiST (Space-Partitioned GiST) is an unbalanced tree where each internal node partitions its space into disjoint regions, naturally fitting radix tries for IP prefixes (inet/cidr with inet_ops), quadtrees on 2D points, and text radix prefix matching, with a search descending a single path rather than potentially several.stable common knowledge
  • BRIN stores one summary entry per range of heap pages (pages_per_range, default 128, about 1 MB of heap) recording the min and max of the indexed column across that range, so a BRIN on a 200 GB time-series table fits in roughly 1 MB of disk and is filtered by a Bitmap Heap Scan rechecking each candidate row.verified
  • BRIN selectivity depends entirely on physical-order correlation: the planner uses it only when the column's min..max per range is tight, so a column whose pg_stats.correlation is not close to 1 or -1 yields a BRIN that the planner refuses and writes still maintain; brin_minmax_multi_ops and brin_bloom_ops (PostgreSQL 14+) extend BRIN to less-correlated data by storing multiple min/max pairs or a Bloom filter per range.verified
  • The hash access method became fully WAL-logged and crash-safe in PostgreSQL 10 (before which it was not safely replicated); it provides O(1) average-case equality lookup but cannot answer <, >, BETWEEN, or ORDER BY, so a B-tree on the same column usually wins because it supports the same equality plus ranges and ordering.verified
  • The pgvector extension adds the vector(N) type plus two ANN index types: IVFFlat partitions vectors into lists clusters at build time and probes the nearest probes clusters at query time, while HNSW builds a multi-layer small-world graph tuned by m and ef_construction at build and ef_search at query; HNSW is the default 2026 recommendation when build cost and disk size are acceptable because it gives consistently higher recall at low latency.stable common knowledge
  • pg_stat_user_indexes reports per-index idx_scan and tuple counts (an idx_scan of 0 over a representative window means the index is unused write-tax), pg_index.indisvalid is false when a CREATE INDEX CONCURRENTLY failed midway leaving an index the planner refuses but writes still maintain, and the pgstattuple extension provides pgstatindex and pgstatginindex for index bloat and GIN pending-list size.verified

Cited sources