Ledger Design — Explainarium

Learning outcomes

Double-entry tells you what a single transaction must look like: balanced legs, value moved and never minted. A production ledger is the machine that records billions of those transactions, serves a balance in a millisecond, never loses a cent to a race condition, survives a crash mid-write, and lets an auditor reconstruct any account as of any past instant. The principle is five hundred years old. The engineering is the part that keeps people up at night, and it is what this page is about.

The goal here is to take you from understanding the accounting rule to designing the system that enforces it at scale, under load, across currencies, behind a service boundary, without ever betraying the invariant that money is conserved.

After studying this page, you can:

Design a chart of accounts and an account hierarchy, and explain why account identity, type, and currency are fixed at creation and never mutated.
Lay out the transactions-and-postings data model that almost every serious ledger converges on, and say what each column is load-bearing for.
Justify, precisely, why postings are immutable and append-only, and how a correction is made without ever rewriting history.
Distinguish a balance that is derived from postings from one that is materialized for speed, and design a materialized read model that can never become the source of truth.
Walk an entry through its lifecycle states (pending, posted, reversed, and the hold or authorization that reserves funds before they move).
Explain how idempotency keys, atomic multi-leg writes, and a chosen consistency model together stop double charges, partial writes, and drift.
Reason about multi-currency and multi-asset ledgers without ever summing two units that are not the same thing.
Scale a ledger from one table on one database to a partitioned, sharded, service-owned system, and name the cost each step adds.
Describe the same ledger from the seat of an engineer, an operator, and an auditor, and see why one design has to satisfy all three.

Before we dive in

This page assumes you already understand double-entry: that every transaction is recorded as at least two postings of equal value in opposite directions, that the books must balance, and that a debit is a direction (the left side of an account), not a decrease. If those words feel slippery, study double-entry first; here we build the system on top of it.

A few terms, each defined the first time it appears. An account is a named bucket of value with a fixed type and currency: a customer’s wallet, a fee account, the cash you hold at a bank. A transaction is one economic event recorded as a balanced group of postings. A posting (also called an entry or a leg) is one signed movement of value into or out of one account. A balance is the net of an account’s postings. An invariant is a property the system must hold true at all times, for example “debits equal credits in every transaction.” Idempotent means doing the same operation twice has the same effect as doing it once. Atomic means a group of writes either all commit or none do, with no partial state in between.

Two conventions. Amounts appear in dollars for readability, but a real ledger stores money as an integer count of the smallest currency unit (cents for dollars), never as a floating-point number, because binary floats cannot represent most decimal fractions exactly, the rounding error loses pennies, and a ledger that loses pennies fails its own balance check. And when we say a balance is “derived,” we mean it is a query over the postings, not a number we overwrite. Hold those two ideas; everything below leans on them.
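
To make the first convention concrete, here is a minimal sketch comparing float dollars with integer cents. The helper name `format_cents` is an illustration only, not part of any ledger API.

```python
# Binary floats cannot represent most decimal fractions exactly.
float_total = 0.10 + 0.20          # dollars as floats
print(float_total == 0.30)         # False

# The same arithmetic in integer minor units (cents) is exact.
cents_total = 10 + 20
print(cents_total == 30)           # True

def format_cents(cents: int) -> str:
    """Render integer cents as a dollar string, for display only."""
    sign = "-" if cents < 0 else ""
    dollars, remainder = divmod(abs(cents), 100)
    return f"{sign}${dollars}.{remainder:02d}"

print(format_cents(cents_total))   # $0.30
```

Conversion to a dollar string happens only at the display edge; everything stored or summed stays an integer.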

Mental Model

The wrong model is that a ledger is a table of account balances that you read and update, like any other database table with a balance column you increment. It feels obvious: an account has a balance, a payment changes it, so you write the new number. Most engineers reach for this on day one, and it is the single most expensive mistake in financial software.

The reason it fails is that a balance updated in place is the only record of the truth, and that record has no memory and no defense. It cannot tell you how it reached its current value, so when it is wrong you cannot reconstruct what went wrong. Two concurrent payments can read the same starting balance and both write back, and one silently vanishes. And it has nothing to check itself against, which throws away the entire self-checking power of double-entry.

Here is the model to hold instead. A ledger is not a table of balances. It is an append-only log of immutable movements, and a balance is a fold over that log: the sum of an account’s postings, computed on demand. You never change the past; you only append to it. Think of it like a bank statement that grows by one line per event and is never erased, where today’s balance is simply the running total of every line so far. The log is the truth. The balance is a question you ask the log. Every design decision below (immutability, idempotency, materialized read models, sharding) is in service of keeping that log correct, complete, and conserved, while still answering “what is the balance” fast enough to charge a card.
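
The "balance is a fold over the log" idea can be sketched in a few lines. This is a toy under stated assumptions: the account names are made up, and `signed_cents` uses a simplified sign (positive means value enters the account), not the debit and credit direction introduced later on this page.

```python
from dataclasses import dataclass
from functools import reduce

@dataclass(frozen=True)          # frozen: a posting cannot be mutated after creation
class Posting:
    account: str
    signed_cents: int            # positive = value in, negative = value out

log: list[Posting] = []          # the append-only log; we only ever call append()

def balance(log: list[Posting], account: str) -> int:
    """Fold the log into one number: the sum of this account's postings."""
    return reduce(
        lambda total, p: total + p.signed_cents if p.account == account else total,
        log,
        0,
    )

# A $100 deposit, then a $25 transfer from alice to bob. Each event is two legs.
log.append(Posting("external", -10_000))
log.append(Posting("alice", 10_000))
log.append(Posting("alice", -2_500))
log.append(Posting("bob", 2_500))

print(balance(log, "alice"))               # 7500
print(balance(log, "bob"))                 # 2500
print(sum(p.signed_cents for p in log))    # 0
```

Nothing in the sketch stores a balance anywhere; the number exists only as the answer to a question asked of the log.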

Breaking it down

The teaching runs in thirteen steps. The first five build the core data model and the immutable, derived-balance design. The middle four handle the lifecycle, correctness, and currency problems that production forces on you. The last four scale the system out and look at it through the eyes of the people who depend on it.

1. What a production ledger has to guarantee

Before any schema, fix the contract. A ledger is the system of record for money, and it makes a small set of promises that nothing is allowed to break. Naming them up front tells you which engineering choices are non-negotiable and which are merely convenient.

First, conservation: every transaction’s signed legs sum to zero, so the system can never create or destroy value by recording a transaction. Second, completeness and immutability: every movement that ever happened is recorded and never erased, so history is reconstructable and tamper-evident. Third, exactly-once application: a single real-world event produces exactly one transaction, even when clients retry. Fourth, atomicity: a transaction’s legs all commit together or not at all, so a debit can never outlive its credit. Fifth, a fast, correct balance: the system can answer “what is this account’s balance” cheaply, even though the truth is a potentially enormous log.

Notice the tension already. The truth is an ever-growing log (good for audit, bad for fast reads), and the answer everyone actually wants is a single current number (fast to read, useless for audit). The whole craft of ledger design is keeping both: an immutable log as the source of truth, and a derived or materialized view for speed, with an unbreakable rule about which one wins when they disagree. The log always wins.

The five guarantees, and what each one rules out

Everything in the rest of the page is an implementation detail in service of one of these five promises. When a design choice is hard, ask which guarantee it protects; if it protects none of them, it is probably optional.

2. The chart of accounts and account hierarchies

An account is the unit a balance attaches to, and the chart of accounts is the full, governed catalog of every account the ledger knows. In a personal accounting tool the chart is a handful of accounts. In a fintech it is a structured, growing set: every customer wallet, every fee and revenue account, every settlement and float account at every bank, every suspense and clearing account used to park value mid-flight.

Three properties of an account are fixed at creation and never mutated: its identity (a stable id), its type (asset, liability, equity, revenue, or expense, which fixes whether a debit increases or decreases it), and its currency or asset (you do not change a USD account into a EUR account; you open a new one). They are immutable because the entire ledger of postings is interpreted relative to them. If you could flip an account’s type from liability to asset after postings exist, every historical balance computed from those postings would silently change meaning. The account’s mutable fields are the soft ones: a display name, a status (open, frozen, closed), metadata.

Accounts form a hierarchy, a tree where a parent account’s balance is the sum of its children. This is what lets a single ledger answer at every altitude: the balance of one customer’s USD wallet, the total of all customer USD wallets, the total of all customer balances across currencies converted to a reporting currency. The leaves are where postings land; the internal nodes are roll-ups you never post to directly.

flowchart TB
  root["All accounts"]
  root --> assets["Assets"]
  root --> liab["Liabilities"]
  assets --> bankUSD["Bank float USD"]
  assets --> bankEUR["Bank float EUR"]
  liab --> custUSD["Customer wallets USD"]
  liab --> custEUR["Customer wallets EUR"]
  custUSD --> w1["Wallet alice USD"]
  custUSD --> w2["Wallet bob USD"]
  custEUR --> w3["Wallet alice EUR"]
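
A roll-up over that tree might look like the sketch below. The node names mirror the diagram, the leaf balances are invented for illustration, and totals are kept per currency because an internal node such as "Liabilities" has children in more than one unit.

```python
from collections import defaultdict

# Hypothetical tree mirroring the diagram: child -> parent.
PARENT = {
    "assets": "root", "liabilities": "root",
    "bank_usd": "assets", "bank_eur": "assets",
    "cust_usd": "liabilities", "cust_eur": "liabilities",
    "wallet_alice_usd": "cust_usd", "wallet_bob_usd": "cust_usd",
    "wallet_alice_eur": "cust_eur",
}

# Leaf balances in minor units, keyed by (account, currency). Illustrative numbers.
LEAF = {
    ("wallet_alice_usd", "USD"): 7_500,
    ("wallet_bob_usd", "USD"): 2_500,
    ("wallet_alice_eur", "EUR"): 9_000,
}

def rollup(leaves):
    """Add each leaf balance to every ancestor, keeping currencies separate."""
    totals = defaultdict(lambda: defaultdict(int))
    for (account, currency), amount in leaves.items():
        node = account
        while node is not None:
            totals[node][currency] += amount
            node = PARENT.get(node)      # None once we step past the root
    return totals

totals = rollup(LEAF)
print(dict(totals["cust_usd"]))      # {'USD': 10000}
print(dict(totals["liabilities"]))   # {'USD': 10000, 'EUR': 9000}
```

Collapsing a multi-currency node into one reporting-currency figure is a separate conversion step applied to these per-currency totals, never a raw sum.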

Two account-design choices separate a toy from a production ledger. First, account-per-customer-per-currency rather than one row with many balances: a customer who holds dollars and euros has two distinct leaf accounts, because a balance is only meaningful within one currency. Second, the deliberate use of internal accounts that hold no customer’s money but make every transaction balanced and traceable: a clearing account where a card payment sits between authorization and settlement, a suspense account for value you have received but cannot yet attribute, a fee-revenue account that the fees you charge customers flow into. These internal accounts are not bureaucracy; they are how you keep every flow double-entry even when the real-world money is mid-flight.

Account kinds you will actually open

A liability account, one per customer per currency. It is money you owe the customer. It increases with a credit when they deposit and decreases with a debit when they withdraw. It is the leaf where most postings land.

3. The transactions and postings data model

Now the heart of the system. Almost every serious ledger converges on the same two-table shape: a transactions table (the balanced group, the economic event) and a postings table (the individual legs). The postings table is the source of truth; the transactions table groups legs and carries the idempotency key and metadata.

-- One row per economic event. Groups its legs, carries idempotency and metadata.
CREATE TABLE transactions (
  id              BIGINT PRIMARY KEY,
  idempotency_key TEXT UNIQUE NOT NULL,   -- exactly-once: a repeat key returns the same txn
  status          SMALLINT NOT NULL,      -- pending, posted, reversed
  reverses_id     BIGINT REFERENCES transactions(id),  -- if this txn reverses another, points at it
  created_at      TIMESTAMPTZ NOT NULL,
  metadata        JSONB
);

-- The truth: an append-only list of balanced legs. Never updated, never deleted.
CREATE TABLE postings (
  id             BIGINT PRIMARY KEY,
  transaction_id BIGINT NOT NULL REFERENCES transactions(id),
  account_id     BIGINT NOT NULL,
  direction      SMALLINT NOT NULL CHECK (direction IN (1, -1)),  -- +1 debit, -1 credit
  amount         BIGINT NOT NULL CHECK (amount > 0),              -- minor units (cents), always positive, never a float
  currency       CHAR(3) NOT NULL,   -- must match the account's currency
  created_at     TIMESTAMPTZ NOT NULL
);

Every column earns its place. The transaction_id groups the legs so the balance check (signed legs sum to zero) runs per transaction. The direction is a sign, kept separate from the always-positive amount so the arithmetic stays unambiguous and a balance is a clean SUM(direction * amount). The amount is an integer of minor units, never a float, for the exactness reasons in the prerequisite. The currency is denormalized onto each posting so a posting can never be silently summed against a different unit. And created_at plus a monotonic id give you the ordering you will need for replay and for “balance as of” queries.

The mistake to avoid here is collapsing the two tables into one row that stores both legs as columns (a from_account, to_account, amount). It looks tidier, and it works for the simplest two-leg transfer, but it cannot represent a transaction with three or more legs (a payment split into a net amount, a fee, and a tax), and it hides the symmetry that makes the balance check trivial. The general, future-proof shape is one row per leg, grouped by transaction.

A $100 deposit with a $1 fee, as rows

The event: A customer deposits $100 and you charge a $1 fee. This is one economic event, so it is one transaction with multiple legs that must balance.
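
One plausible set of legs for this event is sketched below. The three account names are assumptions for illustration; the direction convention (+1 debit, -1 credit) and the integer cents follow the postings schema above.

```python
# (account, direction, amount_cents, currency); direction: +1 debit, -1 credit
legs = [
    ("bank_float_usd",   +1, 10_000, "USD"),  # asset up: $100 arrived at the bank
    ("wallet_alice_usd", -1,  9_900, "USD"),  # liability up: $99 now owed to the customer
    ("fee_revenue_usd",  -1,    100, "USD"),  # revenue up: the $1 fee
]

def signed_sum(legs):
    """The balance check: signed legs of one transaction must sum to zero."""
    return sum(direction * amount for _, direction, amount, _ in legs)

print(signed_sum(legs))   # 0
```

A single from/to/amount row could not hold this event, because it has one source and two destinations.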


4. Immutable append-only design and why you never update in place

The postings table is append-only: once a posting is written, it is never updated and never deleted. This is the property that turns a database table into a ledger you can trust, and it is worth being precise about why.

If a transaction was wrong, you do not erase it. You write a new reversing transaction, a fresh set of legs that are the mirror image of the original (a credit where there was a debit and a debit where there was a credit, each equal in amount), which brings the affected balances back to where they were. Then, if needed, you write the corrected transaction. The original stays in the record forever. A single corrected mistake therefore leaves three transactions in the log: the error, its reversal, and the fix. That trail is not clutter; it is the literal truth about what happened, and it is what lets an auditor see not just the current state but every state the books were ever in.

sequenceDiagram
  participant C as Client
  participant L as Ledger
  C->>L: Post transaction T1 ($100 to wrong account)
  L-->>C: T1 posted
  Note over L: Error found. Do NOT edit T1.
  C->>L: Post reversing transaction R1 (mirror of T1)
  L-->>C: R1 posted, balances restored
  C->>L: Post corrected transaction T2 ($100 to right account)
  L-->>C: T2 posted
  Note over L: Log now holds T1, R1, T2 forever.
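
The T1, R1, T2 sequence can be sketched as follows. The `Leg` type and the account names are hypothetical; balances use the schema's convention, so a customer wallet (a liability) carries a negative number when money is owed to the customer.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Leg:
    account: str
    direction: int   # +1 debit, -1 credit
    amount: int      # minor units, always positive

def reverse(legs):
    """Mirror every leg: same account and amount, opposite direction."""
    return [replace(leg, direction=-leg.direction) for leg in legs]

def balances(log):
    """Fold every leg of every transaction into per-account balances."""
    out = {}
    for txn in log:
        for leg in txn:
            out[leg.account] = out.get(leg.account, 0) + leg.direction * leg.amount
    return out

t1 = [Leg("bank_float_usd", +1, 10_000), Leg("wallet_bob_usd", -1, 10_000)]    # wrong wallet
r1 = reverse(t1)                                                               # undo, by appending
t2 = [Leg("bank_float_usd", +1, 10_000), Leg("wallet_alice_usd", -1, 10_000)]  # intended wallet

log = [t1, r1, t2]      # all three stay in the log
print(balances(log))
# {'bank_float_usd': 10000, 'wallet_bob_usd': 0, 'wallet_alice_usd': -10000}
```

The wrong wallet nets to zero without T1 ever being touched, and the log still shows that the mistake happened.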

The append-only discipline buys three things that no in-place design can. It buys auditability: any past balance is reconstructable by replaying postings up to a chosen instant, so “what was this account’s balance at midnight on the last day of the quarter” is always answerable, even years later. It buys tamper-evidence: because nothing is overwritten, a missing or altered posting is detectable, and some systems chain a hash of each posting to the previous one so any edit to history is cryptographically visible. And it buys safe concurrency: appends do not contend the way in-place updates do, because two transactions appending different postings are not fighting over the same row.
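
The hash-chaining idea mentioned above can be sketched like this. It is a minimal illustration, assuming SHA-256 over a JSON encoding of each posting; real systems differ in what they hash and where they anchor the chain.

```python
import hashlib
import json

GENESIS = "0" * 64   # placeholder "previous hash" for the first entry

def chain_hash(prev_hash: str, posting: dict) -> str:
    """Hash this posting together with the hash of the entry before it."""
    payload = json.dumps(posting, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()

def append(log: list, posting: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"posting": posting, "hash": chain_hash(prev, posting)})

def verify(log: list) -> bool:
    """Recompute the chain from the start; any edited entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["hash"] != chain_hash(prev, entry["posting"]):
            return False
        prev = entry["hash"]
    return True

log = []
append(log, {"account": "alice", "direction": -1, "amount": 10_000})
append(log, {"account": "bank", "direction": 1, "amount": 10_000})
print(verify(log))                       # True
log[0]["posting"]["amount"] = 1          # simulate tampering with history
print(verify(log))                       # False
```

Because each hash covers the previous one, quietly editing an old posting would require recomputing every later hash, which is detectable if the latest hash is also recorded somewhere the editor cannot reach.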

The cost is real and worth naming: the log only grows, and computing a balance by summing a multi-year history is too slow to do on every read. That cost is exactly what the next section, derived versus materialized balances, exists to pay down. But the direction of the trade is fixed. The log is the truth, and we will build fast reads on top of it, never instead of it.

5. Derived balances versus materialized read models

A balance is, by definition, the sum of an account’s postings. That definition is the source of truth, but summing a long history on every read does not scale. The resolution is a deliberate split between two things that beginners conflate: the derived balance (the definitional sum, always correct, sometimes slow) and the materialized balance (a cached, precomputed number, fast to read, and never the source of truth).

The rule that keeps this safe is absolute: the materialized balance is a cache, and it must be reconstructable from the postings at any time. You update it by adding each new posting’s signed amount as you append the posting, ideally in the same atomic write, so the cache moves in lockstep with the log. You verify it by periodically recomputing the definitional sum and comparing; if they ever disagree, the log wins and the cache is rebuilt. A materialized balance that cannot be rederived from the log has quietly become a second source of truth, and the moment you have two sources of truth for money, you have a reconciliation problem with yourself.

-- Materialized balance: fast to read, kept in step with each append.
CREATE TABLE account_balances (
  account_id BIGINT PRIMARY KEY,
  balance    BIGINT NOT NULL,        -- SUM(direction * amount) of all postings so far
  currency   CHAR(3) NOT NULL,
  version    BIGINT NOT NULL         -- bumps per applied posting; guards lost updates
);

-- It must always equal the definitional sum over the log:
SELECT account_id, SUM(direction * amount) AS truth
FROM   postings
WHERE  account_id = $1
GROUP  BY account_id;
-- If account_balances.balance != truth, the cache is wrong. Rebuild from postings.
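
A reconciliation job built on that query might look like the sketch below. It uses Python's built-in `sqlite3` with a cut-down schema so it runs anywhere; the function names `find_drift` and `rebuild` are illustrative, and the drift on account 2 is planted deliberately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE postings (
        id INTEGER PRIMARY KEY, account_id INTEGER NOT NULL,
        direction INTEGER NOT NULL, amount INTEGER NOT NULL
    );
    CREATE TABLE account_balances (
        account_id INTEGER PRIMARY KEY, balance INTEGER NOT NULL
    );
    INSERT INTO postings (account_id, direction, amount) VALUES
        (1, 1, 10000), (1, -1, 2500), (2, 1, 2500);
    -- Account 2's cache is deliberately wrong to simulate drift.
    INSERT INTO account_balances VALUES (1, 7500), (2, 9999);
""")

def find_drift(conn):
    """Return (account_id, cached, truth) for every account whose cache disagrees."""
    return conn.execute("""
        SELECT b.account_id, b.balance,
               COALESCE(SUM(p.direction * p.amount), 0) AS truth
        FROM account_balances b
        LEFT JOIN postings p ON p.account_id = b.account_id
        GROUP BY b.account_id, b.balance
        HAVING b.balance != COALESCE(SUM(p.direction * p.amount), 0)
    """).fetchall()

def rebuild(conn, account_id):
    """The log wins: overwrite the cache with the definitional sum."""
    with conn:
        conn.execute("""
            UPDATE account_balances
            SET balance = (SELECT COALESCE(SUM(direction * amount), 0)
                           FROM postings WHERE account_id = ?)
            WHERE account_id = ?
        """, (account_id, account_id))

drift = find_drift(conn)
print(drift)                 # [(2, 9999, 2500)]
for account_id, _, _ in drift:
    rebuild(conn, account_id)
print(find_drift(conn))      # []
```

Note the direction of the repair: the cache is overwritten from the postings, and nothing ever writes from the cache back into the log.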

The whole shape of this split fits in one sentence: an event becomes a posting, the posting appends to the immutable log (the source of truth), the same posting updates the materialized balance (the fast read path), and a continuous reconciliation job compares the two. The log is always the thing the cache is checked against, never the other way around.

This split is also where the lost update failure lives and dies. If two postings hit the same account concurrently and both read the old materialized balance and write back, one update is lost and money silently disappears from the cache (the log is still correct, which is the saving grace). You prevent it the standard way: a version column with optimistic concurrency (a write only succeeds if the version is unchanged, otherwise it retries), or a row lock, or by routing all of one account’s writes through a single ordered path. The log remains correct regardless, but you want the cache correct too, so you do not lean on a nightly rebuild to catch a hot-path race.
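
The version-column defense can be sketched in plain Python. This is a single-threaded simulation of two interleaved writers, not a real concurrent test; in a database the `compare_and_set` step would typically be an `UPDATE ... WHERE version = ?` that affects zero rows when the version has moved.

```python
from dataclasses import dataclass

@dataclass
class CachedBalance:
    balance: int
    version: int

class StaleVersion(Exception):
    pass

def compare_and_set(row, expected_version, new_balance):
    """Apply the write only if nobody else has bumped the version since we read."""
    if row.version != expected_version:
        raise StaleVersion
    row.balance = new_balance
    row.version += 1

def apply_posting(row, signed_amount, max_retries=3):
    for _ in range(max_retries):
        seen_balance, seen_version = row.balance, row.version   # read
        try:
            compare_and_set(row, seen_version, seen_balance + signed_amount)
            return
        except StaleVersion:
            continue                                            # re-read and retry
    raise RuntimeError("too much contention")

row = CachedBalance(balance=10_000, version=0)

# Two writers both read version 0 before either writes.
a_balance, a_version = row.balance, row.version
b_balance, b_version = row.balance, row.version
compare_and_set(row, a_version, a_balance - 3_000)      # writer A wins
try:
    compare_and_set(row, b_version, b_balance - 2_000)  # writer B is rejected, not lost
except StaleVersion:
    apply_posting(row, -2_000)                          # B retries from fresh state

print(row)   # CachedBalance(balance=5000, version=2)
```

Without the version check, writer B would have written 8000 over A's 7000 and the $30 movement would have vanished from the cache.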

6. The entry lifecycle states

A transaction is not always born final. Real money flows have an in-between: a customer’s card is authorized today and captured tomorrow, a transfer is initiated but not yet settled, a payment is held pending a risk review. The ledger models this with explicit lifecycle states, and getting them right is what keeps a balance honest about what is available versus what is merely promised.

The core states are pending (proposed, legs written but not yet final, often visible as a hold), posted (final and counted in the settled balance), and reversed (undone by a compensating transaction). On top of these sits the authorization or hold: a reservation that reduces the customer’s available balance without yet moving money, so a card auth for $50 leaves the settled balance untouched but stops the customer from spending the same $50 twice. When the merchant captures, the hold converts to a posted transaction; if it expires, the hold is released and the available balance returns.

stateDiagram-v2
  [*] --> Pending: transaction proposed
  Pending --> Posted: legs balance and write succeeds
  Pending --> Rejected: legs do not balance or risk declines
  Pending --> Expired: hold times out, funds released
  Posted --> Reversed: error found, write compensating entry
  Reversed --> [*]
  Posted --> [*]
  Rejected --> [*]
  Expired --> [*]
  note right of Posted
    Posted postings are never edited.
    Available balance = settled balance minus active holds.
  end note

This is why a serious ledger tracks two balances per account, not one. The settled balance is the sum of posted postings: money that has truly moved. The available balance is the settled balance minus active holds: money the customer can actually spend right now. Confusing the two is a classic production bug. If you let a customer spend against the settled balance while a hold is outstanding, they can double-spend the held funds; if you forget to release an expired hold, you strand the customer’s own money. The hold is itself recorded as postings (often into a dedicated holds or reserved account) so that even reservations keep the books double-entry and traceable.
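
The two-balance rule can be sketched with a small class. This is a deliberate simplification: it keeps holds in an in-memory dict and views the balance from the customer's side as a positive number, whereas the text above records holds as postings. The class and method names are illustrative.

```python
class Account:
    def __init__(self, settled: int):
        self.settled = settled          # sum of posted postings, minor units
        self.holds = {}                 # hold_id -> reserved amount

    @property
    def available(self) -> int:
        """Available balance = settled balance minus active holds."""
        return self.settled - sum(self.holds.values())

    def authorize(self, hold_id: str, amount: int) -> None:
        if amount > self.available:     # check against available, not settled
            raise ValueError("insufficient available funds")
        self.holds[hold_id] = amount

    def capture(self, hold_id: str) -> None:
        self.settled -= self.holds.pop(hold_id)   # hold converts to a posted movement

    def release(self, hold_id: str) -> None:
        self.holds.pop(hold_id)                   # expiry: available balance returns

acct = Account(settled=8_000)
acct.authorize("auth-1", 5_000)
print(acct.settled, acct.available)   # 8000 3000
try:
    acct.authorize("auth-2", 5_000)   # would double-spend the held funds
except ValueError as err:
    print(err)                        # insufficient available funds
acct.capture("auth-1")
print(acct.settled, acct.available)   # 3000 3000
```

Had `authorize` compared against `settled`, the second $50 authorization would have passed and the same funds would have been promised twice.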

A card authorization and capture, state by state

Authorize ($50 hold): The customer's card is authorized for $50. A pending hold is written: available balance drops by $50, but the settled balance is unchanged. No money has actually moved; the funds are reserved.


7. Idempotency keys and exactly-once posting

Distributed systems retry. A client calls the ledger to post a payment, the network drops the response, and the client, not knowing whether the write landed, calls again. Without protection, the second call posts a second balanced transaction, and the customer is charged twice. Crucially, double-entry does not save you here: each duplicate transaction is internally balanced, so the trial balance stays at zero while the customer is out real money. Balancing constrains the shape of one transaction; it says nothing about how many you posted.

The fix is the idempotency key: the client generates a unique key for the logical operation and attaches it to every attempt. The ledger records which keys it has already applied (the UNIQUE constraint on idempotency_key in the transactions table does exactly this), and a repeat of a key returns the original transaction’s result instead of posting again. The retry safely collapses onto the first attempt. One real payment, one transaction, regardless of how many times the client tries.

A retried transfer, with and without an idempotency key

Without an idempotency key: the client sends 'transfer $100', the network drops the response, and the client retries. The ledger now holds TWO balanced transactions for one real payment: $200 moved, and the customer is charged twice. Both transactions balance perfectly, so the trial balance is happy. Double-entry did not save you, because each duplicate was internally valid.

Two subtleties separate a correct idempotency design from a leaky one. First, the key must cover the logical operation, not the network call: the client owns the key and keeps it stable across retries, so a randomly regenerated key on each attempt defeats the entire mechanism. Second, the idempotent record and the postings must commit together. If you record the key in one transaction and write the postings in another, a crash between them leaves a key marked applied with no postings behind it (or postings with no key), and the retry either skips a real payment or duplicates it. The key insert and the legs belong in one atomic commit, which is exactly the atomicity guarantee the next section is about.
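
Both subtleties can be seen in a short sketch using Python's built-in `sqlite3`. The schema is a cut-down version of the one on this page, and `post` is a hypothetical function name. The key insert and the legs share one database transaction, and a repeated key returns the original transaction id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY,
        idempotency_key TEXT UNIQUE NOT NULL
    );
    CREATE TABLE postings (
        id INTEGER PRIMARY KEY,
        transaction_id INTEGER NOT NULL REFERENCES transactions(id),
        account_id INTEGER NOT NULL,
        direction INTEGER NOT NULL,
        amount INTEGER NOT NULL
    );
""")

def post(conn, key, legs):
    """Insert the key and the legs in one commit; a repeated key returns the original id."""
    try:
        with conn:   # one database transaction: commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO transactions (idempotency_key) VALUES (?)", (key,))
            txn_id = cur.lastrowid
            conn.executemany(
                "INSERT INTO postings (transaction_id, account_id, direction, amount) "
                "VALUES (?, ?, ?, ?)",
                [(txn_id, account, direction, amount) for account, direction, amount in legs])
            return txn_id
    except sqlite3.IntegrityError:
        # The UNIQUE constraint fired: this key was already applied.
        row = conn.execute(
            "SELECT id FROM transactions WHERE idempotency_key = ?", (key,)).fetchone()
        return row[0]

legs = [(1, +1, 10_000), (2, -1, 10_000)]
first = post(conn, "transfer-abc", legs)
retry = post(conn, "transfer-abc", legs)     # same key: collapses onto the first attempt
count = conn.execute("SELECT COUNT(*) FROM postings").fetchone()[0]
print(first == retry, count)                 # True 2
```

The caller chose `"transfer-abc"` once and reused it on the retry; generating a fresh key per attempt would have produced four postings instead of two.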

There is a deeper point here worth stating plainly. Balancing and idempotency defend against orthogonal failures. Balancing stops a single malformed entry that would create or destroy value. Idempotency stops a correct entry from being applied too many times. A production ledger needs both, and neither substitutes for the other.

8. Ordering, atomicity, and consistency guarantees

A ledger is one of the rare systems where you genuinely cannot trade away strong consistency on the write path. A social feed tolerates a like that arrives late; a ledger cannot tolerate a debit that posts without its credit, because that is money created from nothing. So the write path has hard requirements, and understanding them tells you exactly where you may relax and where you may not.

Atomicity is first. A transaction’s legs (and its idempotency record, and its materialized-balance updates) must commit all together or not at all. On a single database this is free: wrap the legs in one database transaction. The instant any leg can commit while another fails, you can have a posted debit with no credit, and the conservation invariant is broken. Atomicity is the property that makes a multi-leg transaction behave as one indivisible event.
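
A small demonstration of "all legs or none" on a single database, again with the built-in `sqlite3` and a cut-down postings table. The failure is forced with an invalid second leg; `post_legs` is an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE postings (
        account_id INTEGER NOT NULL,
        direction  INTEGER NOT NULL CHECK (direction IN (1, -1)),
        amount     INTEGER NOT NULL CHECK (amount > 0)
    )
""")

def post_legs(conn, legs):
    with conn:   # commit all legs, or roll all of them back if any insert raises
        for leg in legs:
            conn.execute("INSERT INTO postings VALUES (?, ?, ?)", leg)

try:
    # The debit leg is valid; the credit leg violates CHECK (amount > 0).
    post_legs(conn, [(1, +1, 10_000), (2, -1, 0)])
except sqlite3.IntegrityError:
    pass

print(conn.execute("SELECT COUNT(*) FROM postings").fetchone()[0])   # 0
```

The first insert succeeded inside the transaction, yet nothing survives: the debit did not outlive its failed credit.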

Ordering is second. Within an account, postings have a definite order (a monotonic id or sequence), because “balance as of a moment” and replay both depend on a stable history. Across unrelated accounts, global order rarely matters; what matters is that each account’s own log is totally ordered, which is why partitioning by account works so well later.

Consistency model is third, and this is where the famous trade-off lives. The write path is strongly consistent: a transaction is either fully posted or not, with no in-between visible to anyone. The read path may be allowed to lag. A reporting query that reads a balance a few seconds stale is usually fine; a balance check that authorizes a payment must be current, or you will let a customer overdraw. So the honest framing is not “consistency versus availability” for the whole system; it is strong consistency on writes and on authorization reads, with eventual consistency tolerated only for reads that cannot cause an overdraw or a double-spend.

flowchart LR
  W["Write path<br/>post a transaction"] --> A["Atomic commit<br/>all legs or none"]
  A --> O["Ordered append<br/>per-account sequence"]
  O --> SC["Strongly consistent<br/>truth in the log"]
  SC --> AR["Authorization reads<br/>must be current"]
  SC --> RR["Reporting reads<br/>may lag a little"]

The failure modes this section prevents are the scary ones. A partial write (some legs committed, some not) is prevented by atomicity. A lost update (two writes clobbering one balance) is prevented by ordered, version-checked writes. Drift (the materialized balance diverging from the log) is caught by reconciliation and resolved by trusting the log. None of these can be papered over later; they have to be designed in at the write path, because by the time they show up in a balance, the money is already wrong.

9. Multi-currency and multi-asset ledgers

The single most important rule of a multi-currency ledger is the one beginners most want to break: you never sum two different currencies into one balance. A balance is only meaningful within one unit. Ten dollars plus ten euros is not twenty of anything; it is ten dollars and ten euros, two separate facts. So the design is not “an account with a currency field on each posting that you add up”; it is one account per currency, each with its own self-contained, balanced log.

This generalizes cleanly from currencies to assets: a ledger that tracks dollars, euros, bitcoin, and shares of a stock treats each as its own unit with its own accounts. The minor unit and scale differ (cents for dollars, satoshis for bitcoin, whole shares or fractional shares for equities), but the principle is identical: every posting carries its unit, every balance is single-unit, and the conservation invariant holds within each unit independently.

What about a transaction that crosses currencies, like a customer converting $100 into euros? You do not put a dollar leg and a euro leg in the same balance check, because they are different units and cannot sum to zero together. Instead you model the conversion as two balanced transactions joined by an FX position: dollars leave the customer’s USD account and enter your USD FX account (balanced in dollars), and euros leave your EUR FX account and enter the customer’s EUR account (balanced in euros). Your FX accounts now hold a long-dollar, short-euro position whose value you manage as a business, and each currency’s books independently balance. The exchange rate lives in the transaction metadata and in the FX position, not inside a single illegal mixed-currency leg.
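
The conversion can be sketched as two transactions, one per currency. The account names and the 0.90 rate are illustrative, the rate is held as an integer ratio to avoid floats, and the sign is a simplified value-flow convention (negative means value leaves the account) rather than debit and credit direction.

```python
from collections import defaultdict

RATE_NUM, RATE_DEN = 90, 100          # 0.90 EUR per USD as an exact integer ratio

def convert_usd_to_eur(usd_cents: int):
    # Floor division here; the rounding policy is a business decision.
    eur_cents = usd_cents * RATE_NUM // RATE_DEN
    usd_txn = [("wallet_alice_usd", -usd_cents, "USD"), ("fx_usd", +usd_cents, "USD")]
    eur_txn = [("fx_eur", -eur_cents, "EUR"), ("wallet_alice_eur", +eur_cents, "EUR")]
    return usd_txn, eur_txn

def net_by_currency(*txns):
    """Sum signed amounts within each currency, never across currencies."""
    net = defaultdict(int)
    for txn in txns:
        for _, signed, currency in txn:
            net[currency] += signed
    return dict(net)

usd_txn, eur_txn = convert_usd_to_eur(10_000)
print(eur_txn[1])                         # ('wallet_alice_eur', 9000, 'EUR')
print(net_by_currency(usd_txn, eur_txn))  # {'USD': 0, 'EUR': 0}
```

After the conversion the house FX accounts hold +10000 USD cents and -9000 EUR cents: the long-dollar, short-euro position described above.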

Check yourself

A customer converts $100 to euros at 0.90 EUR per USD. How should the ledger record it so each currency's books still balance?

The lesson lands differently for each participant. To an engineer, multi-currency is “carry the unit and never sum across it.” To an operator, it is a live FX position that has to be hedged and reconciled. To an auditor, it is the demand that every currency’s trial balance net to zero on its own, and that every conversion show two balanced halves and an explicit rate. One design choice (one account per unit) satisfies all three, which is exactly why it is the standard.

10. Partitioning, sharding, and scaling from one database to many

For a startup, the entire ledger is two tables on one database, every transaction commits in one local atomic write, and the strong consistency you need is free because one database gives it to you. This is the correct starting point, and many large fintechs run a remarkably long way on exactly this before changing anything. The first rule of scaling a ledger is: do not distribute it before you must, because every step away from one database costs you the atomicity that was previously free.

When volume genuinely outgrows one machine, the dominant pattern is partitioning by account: each account (and its postings) lives on a specific shard, chosen by hashing or ranging the account id. The payoff is that a transfer between two accounts on the same shard is still one local atomic write, and most traffic can be arranged to stay within a shard. Only a genuinely cross-shard transaction, where the legs live on different shards, has to pay for a distributed commit.
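
Hash-based routing and the same-shard test can be sketched as follows. The shard count and function names are assumptions for illustration; SHA-256 stands in for whatever stable hash a real system uses.

```python
import hashlib

NUM_SHARDS = 4   # illustrative

def shard_for(account_id: int) -> int:
    """Map an account to a shard with a hash that is stable across processes.

    Python's built-in hash() is avoided because string hashing is randomized
    per process, which would route the same account differently after a restart.
    """
    digest = hashlib.sha256(str(account_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

def is_cross_shard(account_ids) -> bool:
    """True when the legs touch more than one shard and need a distributed commit."""
    return len({shard_for(a) for a in account_ids}) > 1

print(is_cross_shard([42, 42]))   # False
```

A router would call `is_cross_shard` on a transaction's accounts before committing: the common case takes the cheap local path, and only the remainder pays for coordination.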

flowchart TB
  subgraph one["Stage 1 single ledger DB"]
    A["Two tables, all accounts and postings,<br/>one strongly consistent database"]
  end
  subgraph two["Stage 2 partition by account"]
    B["Shard accounts across DBs<br/>same-shard transfers stay one atomic write"]
    C["Cross-shard transactions use<br/>two-phase commit or a saga<br/>so all legs commit together"]
  end
  subgraph three["Stage 3 dedicated ledger service"]
    D["One service owns the invariant<br/>and exposes a narrow API<br/>no other service writes money directly"]
  end
  one --> two --> three

Cross-shard transactions are where the cost lands, and there are two main ways to pay it. A two-phase commit keeps the legs atomic across shards at the price of latency and a coordinator that must not fail at the wrong moment. A saga breaks the transaction into ordered local steps with compensating reversals, trading strict atomicity across the whole flow for availability: if a later step fails, you run the compensations, which (because the ledger is immutable) are themselves reversing transactions. For that to be safe, each local step must itself be a balanced transaction, typically by parking the value in a clearing or in-transit account on the first shard until the second shard’s step lands. The transactional outbox pattern is the connective tissue: a service writes its postings and an outgoing event in one local transaction, so the event that tells downstream systems is never lost even if the message broker is down. Across all of these, one thing never bends: the legs of every recorded transaction must still commit together. You may shard storage, you may let reporting reads lag, but the moment a debit can outlive its credit across a shard boundary, you no longer have a ledger.
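
The transactional outbox can be sketched with the built-in `sqlite3`. The table layout and the names `post_with_event` and `relay` are assumptions for illustration, and `publish` stands in for a real message broker client.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE postings (account_id INTEGER, direction INTEGER, amount INTEGER);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY, payload TEXT NOT NULL, published INTEGER NOT NULL DEFAULT 0
    );
""")

def post_with_event(conn, legs, event):
    with conn:   # postings and the outgoing event commit together or not at all
        conn.executemany("INSERT INTO postings VALUES (?, ?, ?)", legs)
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))

def relay(conn, publish):
    """Drain unpublished events; a row is marked only after publish() returns."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id").fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))

post_with_event(conn, [(1, +1, 10_000), (2, -1, 10_000)], {"type": "transfer.posted"})
sent = []
relay(conn, sent.append)     # a list stands in for the broker
print(sent)                  # [{'type': 'transfer.posted'}]
```

If the relay crashes after publishing but before marking the row, the event is sent again on the next run, so delivery is at-least-once and consumers need their own idempotency.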

The honest engineering judgment is that sharding a ledger is expensive and you should reach for cheaper levers first: a bigger machine, read replicas for reporting, a materialized-balance cache so reads stop scanning the log, and partitioning cold history off the hot path. Distribution is the last lever, not the first, precisely because it is the one that threatens the invariant.

11. The ledger-as-a-service pattern

As an organization grows, many services want to touch money: payments, payouts, refunds, lending, rewards. The tempting shortcut is to let each one write to the ledger tables directly. This is how the invariant dies. If five services can write postings, then five teams can each, independently, write an unbalanced transaction, skip the idempotency key, sum two currencies, or update a balance in place. The conservation law is only as strong as the least careful writer.

The pattern that survives this is ledger-as-a-service: one component owns the ledger, owns the invariant, and exposes a narrow, well-guarded API (post a transaction, query a balance, place a hold). Every other service calls the ledger; none of them writes money directly. The ledger becomes the single chokepoint where balancing is enforced, idempotency is required, currencies are checked, and atomicity is guaranteed. No other service can violate the rule, because no other service has the keys to the table.

flowchart LR
  pay["Payments service"] --> L["Ledger service<br/>owns the invariant"]
  payout["Payouts service"] --> L
  lend["Lending service"] --> L
  reward["Rewards service"] --> L
  L --> db["Immutable postings<br/>source of truth"]
  L --> bal["Balance API<br/>derived / materialized"]

The API surface is deliberately small and opinionated, and that is the point. It accepts a transaction as a set of legs plus an idempotency key, validates that the legs balance and share legal currencies, commits them atomically, and returns the posted transaction or, on a repeated key, the original result. It refuses anything malformed. Because the surface is narrow, the ledger team can reason about every way money moves, evolve the storage (add sharding, swap the database) without breaking callers, and give auditors a single place where every monetary event is recorded under one set of rules.
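
A minimal in-memory sketch of that surface follows. The class and method names (Ledger, Leg, post, balance) are hypothetical, amounts are assumed to be signed integers in minor units that must sum to zero per currency, and a Python list stands in for durable storage. A real service would wrap the append in a database transaction to get atomicity under concurrency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leg:
    account: str
    currency: str
    amount: int  # signed minor units; a transaction's legs sum to zero (assumed convention)


class LedgerError(Exception):
    pass


class Ledger:
    def __init__(self, account_currencies):
        self._account_currencies = dict(account_currencies)  # fixed at account creation
        self._postings = []  # append-only list of (txn_id, Leg)
        self._by_key = {}    # idempotency key -> txn_id of the original post

    def post(self, idempotency_key, legs):
        if idempotency_key in self._by_key:
            return self._by_key[idempotency_key]  # repeated key: original result, no new postings
        legs = tuple(legs)
        self._validate(legs)  # validate every leg before writing any of them
        txn_id = len(self._by_key) + 1
        self._postings.extend((txn_id, leg) for leg in legs)
        self._by_key[idempotency_key] = txn_id
        return txn_id

    def _validate(self, legs):
        if len(legs) < 2:
            raise LedgerError("a transaction needs at least two legs")
        totals = {}
        for leg in legs:
            expected = self._account_currencies.get(leg.account)
            if expected is None:
                raise LedgerError(f"unknown account {leg.account}")
            if expected != leg.currency:
                raise LedgerError(f"{leg.account} holds {expected}, not {leg.currency}")
            totals[leg.currency] = totals.get(leg.currency, 0) + leg.amount
        unbalanced = {cur: total for cur, total in totals.items() if total != 0}
        if unbalanced:
            raise LedgerError(f"legs do not balance: {unbalanced}")

    def balance(self, account):
        return sum(leg.amount for _, leg in self._postings if leg.account == account)


ledger = Ledger({"alice": "USD", "bob": "USD", "carol": "EUR"})
legs = [Leg("alice", "USD", -2500), Leg("bob", "USD", 2500)]
first = ledger.post("pay-001", legs)
retry = ledger.post("pay-001", legs)  # a retried request with the same key
```

Here the retry returns the same transaction id as the first call and appends nothing, which is the property that makes client retries safe.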

Who is allowed to write money

Suppose payments, payouts, refunds, lending, and rewards each write postings directly. Any of the five teams can then ship an unbalanced transaction, a missing idempotency key, or an in-place balance update. The invariant holds only if all five teams are equally careful forever, which they will not be. The ledger drifts and no one owns the rule.

The trade-off is real: a service boundary adds a network hop and a dependency, and the ledger becomes a critical-path system that must be highly available. But the alternative, a money invariant guarded by convention across many teams, is not a trade-off a serious institution can accept. The boundary is the price of being able to trust the books.

12. Failure modes and the controls that catch each one

A ledger is self-checking, but it is not self-correcting, and it does not defend against every mistake for free. Knowing exactly which failures the structure catches and which require an external control is the difference between an engineer who trusts the design blindly and one who builds the right guardrails around it.

Ledger failure modes, and what actually catches each one

The pattern across that list is the single most important thing to carry away. The ledger’s internal machinery (balancing, atomicity, immutability, idempotency, one-account-per-currency) guarantees internal consistency: the books agree with themselves. It says nothing about external correctness: whether the books agree with reality. A set of books can be perfectly balanced and completely wrong, because every entry was internally valid while some were posted to the wrong account, duplicated, or missing entirely. This is exactly why reconciliation exists as a separate discipline: it checks the ledger against the outside world (the bank statement, the custodian report, the card-network settlement file) precisely where the internal checks are silent. The trial balance proves the books are consistent; reconciliation proves they are true.
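
The gap between consistent and true can be shown in a few lines. This sketch assumes postings are (reference, account, amount) tuples with signed minor-unit amounts, and that ledger lines and bank statement lines share a reference that can be matched exactly. Real reconciliation usually needs looser matching, and the function names are illustrative.

```python
from collections import Counter


def trial_balance(postings):
    """Internal check: signed amounts across all postings sum to zero."""
    return sum(amount for _, _, amount in postings)


def reconcile(postings, bank_account, statement):
    """External check: compare ledger movements on the bank account with the bank's lines."""
    ledger_lines = Counter(
        (ref, amount) for ref, account, amount in postings if account == bank_account
    )
    bank_lines = Counter(statement)
    return {
        "in_ledger_not_bank": sorted((ledger_lines - bank_lines).elements()),
        "in_bank_not_ledger": sorted((bank_lines - ledger_lines).elements()),
    }


postings = [
    ("dep-1", "bank", 10_000), ("dep-1", "customer:alice", -10_000),
    ("dep-2", "bank", 4_000), ("dep-2", "customer:bob", -4_000),
    ("dep-2", "bank", 4_000), ("dep-2", "customer:bob", -4_000),  # duplicate, still balanced
]
statement = [("dep-1", 10_000), ("dep-2", 4_000), ("dep-3", 750)]  # dep-3 never reached the ledger

report = reconcile(postings, "bank", statement)
```

The trial balance over these postings is zero, yet the report flags one duplicated deposit and one missing deposit. Both are errors the internal checks cannot see.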

13. Three participants reading the same ledger

A ledger design is good only if it serves the people who depend on it, and three of them read the very same log with very different questions. Designing for one and forgetting the others is how a technically correct ledger still fails in production.

The engineer asks: can I post a transaction atomically, read a balance fast, retry safely, and shard when I grow? Their needs drive the data model, the idempotency keys, the materialized read model, and the service boundary. They care that the invariant is enforced in code, on every write, so no caller can break it.

The operator asks: is money stuck, is a hold stranded, did the bank float drift from the statement, is a reconciliation breaking? Their needs drive the lifecycle states (so in-flight money is visible), the internal clearing and suspense accounts (so stuck value has a home you can watch), and the reconciliation jobs (so external drift surfaces fast). They live in the gap between the books and reality.

The auditor asks: can I reconstruct any balance as of any past instant, see every correction as a reversal rather than a rewrite, and trust that nothing was edited? Their needs drive immutability and append-only storage, the three-record correction trail, and the optional hash-chaining of postings. They care that history is complete and tamper-evident, because their job is to prove the books are honest, not just balanced.
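
What the auditor needs can be sketched with an append-only list and a hash chain. The field names (txn, posted_at, account, amount), the SHA-256 choice, and the all-zero genesis value are assumptions for illustration. Timestamps are assumed to be UTC ISO-8601 strings in one fixed format, so they compare correctly as text.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the chain


def chain_hash(prev_hash, posting):
    """Hash of this posting bound to the hash of the one before it."""
    body = json.dumps(posting, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log, posting):
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"posting": posting, "hash": chain_hash(prev, posting)})


def verify(log):
    """Recompute the chain; an edit to any earlier posting breaks every later hash."""
    prev = GENESIS
    for entry in log:
        if entry["hash"] != chain_hash(prev, entry["posting"]):
            return False
        prev = entry["hash"]
    return True


def balance_as_of(log, account, instant):
    """Sum only the postings that existed at the given instant."""
    return sum(
        e["posting"]["amount"]
        for e in log
        if e["posting"]["account"] == account and e["posting"]["posted_at"] <= instant
    )


log = []
for txn, at, account, amount in [
    ("t1", "2024-01-05T10:00:00Z", "alice", 900), ("t1", "2024-01-05T10:00:00Z", "cash", -900),
    ("t2", "2024-02-01T09:00:00Z", "alice", -900), ("t2", "2024-02-01T09:00:00Z", "cash", 900),
    ("t3", "2024-02-01T09:00:01Z", "alice", 600), ("t3", "2024-02-01T09:00:01Z", "cash", -600),
]:
    append(log, {"txn": txn, "posted_at": at, "account": account, "amount": amount})
```

Here t1 is the original entry with a wrong amount, t2 reverses it, and t3 posts the corrected amount. A balance as of the end of January still shows 900 for alice, and a balance as of March shows 600, so the mistake and its correction both remain visible.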

flowchart TB
  log["One immutable ledger"]
  log --> eng["Engineer<br/>atomic writes, fast reads,<br/>safe retries, sharding"]
  log --> op["Operator<br/>in-flight money visible,<br/>holds and reconciliation"]
  log --> aud["Auditor<br/>reconstruct any past balance,<br/>complete tamper-evident trail"]

The unifying insight is that these are not three systems; they are three readings of one immutable log. The same append-only postings that let the engineer shard and retry are what let the operator trace stuck money and what let the auditor reconstruct the past. That is the deep reason the immutable-log design wins: it is the one structure that simultaneously answers the engineer’s “how do I scale this,” the operator’s “where is the money right now,” and the auditor’s “prove it was always honest.” Design for all three from the start, because they are all reading the truth you chose to store.

Mastery Questions

Your team wants reads to be fast, so an engineer proposes dropping the postings table entirely and keeping only an account_balances table that each service updates in place when money moves. They argue the postings table is just slow, redundant history. What is wrong with this, and what is the smallest change that makes fast reads safe?

Answer. The proposal deletes the source of truth and keeps only the cache, which inverts the one rule that makes a ledger trustworthy. An in-place balance with no underlying log has no history, so a wrong balance cannot be explained or reconstructed, and “what was this balance last quarter” becomes unanswerable. It is exposed to lost updates, where two concurrent writes read the same value and one clobbers the other, silently losing money. And it has nothing to check itself against, so the self-checking power of double-entry is gone and reconciliation has no internal record to compare to the bank. The smallest safe change is the opposite of deletion: keep the immutable postings as the source of truth, and add the account_balances table as a materialized cache that is updated as each posting appends and is periodically recomputed from the postings to catch drift. Fast reads are fine; a fast read model that is the only record is not. The distinction is which one wins when they disagree, and the log must always win.
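
A minimal sketch of that arrangement, assuming an in-memory list for the postings and hypothetical names (LedgerWithCache, repair_drift). In a real system the append and the cache update would share one database transaction.

```python
from collections import defaultdict


class LedgerWithCache:
    def __init__(self):
        self.postings = []                # append-only source of truth: (account, amount)
        self.balances = defaultdict(int)  # materialized read model, safe to rebuild

    def append(self, account, amount):
        self.postings.append((account, amount))
        self.balances[account] += amount  # keep the cache in step with the log

    def recompute(self):
        """Rebuild every balance from the postings alone."""
        truth = defaultdict(int)
        for account, amount in self.postings:
            truth[account] += amount
        return truth

    def repair_drift(self):
        """Report accounts where cache and log disagree, then let the log win."""
        truth = self.recompute()
        drifted = {
            account: (self.balances.get(account, 0), truth.get(account, 0))
            for account in set(truth) | set(self.balances)
            if self.balances.get(account, 0) != truth.get(account, 0)
        }
        self.balances = truth
        return drifted


store = LedgerWithCache()
store.append("alice", 1000)
store.append("alice", -300)
store.append("bob", 300)
store.balances["alice"] += 100    # simulate a bug that corrupts the cache
drifted = store.repair_drift()    # maps account -> (cached value, value from the log)
```

The repair never touches the postings. It only overwrites the cache, which is what it means for the log to be the source of truth.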

A payments service and a payouts service both need to record money movements. One engineer wants each service to own its own postings tables for independence and speed; another wants a single ledger service that both call. Argue the case, and name the failure the wrong choice invites.

Answer. The single ledger service is correct, and the reason is that the conservation invariant is only as strong as the least careful writer. If two services own two sets of postings, then two teams can independently ship an unbalanced transaction, skip an idempotency key, update a balance in place, or sum two currencies, and there is no single place that enforces the rule on every write. The invariant degrades into a convention that every team must honor forever, which is exactly the kind of guarantee that erodes the moment someone is in a hurry. Worse, money that moves between the two services now spans two ledgers, so a single economic event is recorded as two loosely coupled halves that can drift, and reconciling them becomes a permanent tax. The ledger-as-a-service pattern puts balancing, idempotency, currency checks, and atomicity behind one narrow API owned by one team, so no caller can break the rule because no caller has the keys to the table. The cost is a network hop and a critical-path dependency, which is a price a serious institution pays gladly to be able to trust its own books.

A customer authorizes a $50 card payment, and your support team sees the customer complaining that $50 is missing even though no purchase shows on their settled history. The trial balance is clean. What is most likely happening, and what does it tell you about how you must track balances?

Answer. Almost certainly the customer is looking at their available balance, which is the settled balance minus active holds, and a $50 authorization hold is outstanding from the card auth. No money has actually moved (the settled history is correctly empty of a purchase), and the trial balance is clean because the hold is itself recorded as balanced postings into a reserved or holds account. The $50 is not missing; it is reserved, pending capture or expiry. This tells you that a single balance per account is not enough: you must track at least two, the settled balance (the sum of posted postings, money that truly moved) and the available balance (settled minus active holds, money the customer can spend right now). If you collapse them into one, you either let the customer spend the held $50 twice (if you show only settled) or you make it look like real money vanished (if you show only available without explaining the hold). The lifecycle states exist precisely so the ledger can be honest about the difference between money that has moved and money that is merely promised, and a good design surfaces both numbers, with the hold visible and time-bounded so it cannot strand the customer’s funds forever.
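
The two balances can be sketched with a hold recorded as balanced postings into a held sub-account. The account naming (alice:main, alice:held, funding_source), the signed minor-unit amounts, and the function names are assumptions for illustration, and hold expiry is left out for brevity.

```python
postings = []  # append-only: (txn_id, account, amount)


def post(txn_id, legs):
    """Append a transaction only if its signed legs sum to zero."""
    if sum(amount for _, amount in legs) != 0:
        raise ValueError("legs do not balance")
    postings.extend((txn_id, account, amount) for account, amount in legs)


def balance(account):
    return sum(amount for _, acct, amount in postings if acct == account)


def settled(customer):
    """Money that is truly the customer's, whether spendable or reserved."""
    return balance(f"{customer}:main") + balance(f"{customer}:held")


def available(customer):
    """Settled minus active holds: what the customer can spend right now."""
    return settled(customer) - balance(f"{customer}:held")


post("deposit", [("funding_source", -20_000), ("alice:main", 20_000)])
post("auth-hold", [("alice:main", -5_000), ("alice:held", 5_000)])  # the $50 authorization
```

After the hold, the settled balance is still 20,000 minor units while the available balance is 15,000, and the postings as a whole still sum to zero. Releasing the hold is one more balanced transaction from alice:held back to alice:main. Capturing it moves the held amount to the merchant instead.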