
Ordering & Causality

🟠 P1 — how distributed systems reason about “what happened before what”

Problem #

In a single process, events have a natural total order. In a distributed system there is no global clock, so “what happened first?” is a hard question.

Key Concepts #

Happens-Before Relation (Lamport, 1978) #

Event A happens-before B if:

  1. A and B occur in the same process and A comes first, OR
  2. A is the sending of a message and B is the corresponding receipt of that message, OR
  3. Transitivity: there is an event C such that A → C and C → B.

If neither A → B nor B → A holds, the events are concurrent.
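As a rough illustration, the three rules can be read as edges in a directed graph, with transitivity being plain reachability. The sketch below assumes a hypothetical in-memory event log (`process_events`, `messages`) invented for this example; real systems rarely materialise the full graph and use logical clocks instead.

```python
from collections import defaultdict


def build_happens_before(process_events, messages):
    """Return a predicate hb(a, b) that is True when a happens-before b.

    process_events: dict mapping process name -> event ids in local order
    messages: list of (send_event, receive_event) pairs
    """
    edges = defaultdict(set)

    # Rule 1: within one process, each event precedes the next one.
    for events in process_events.values():
        for earlier, later in zip(events, events[1:]):
            edges[earlier].add(later)

    # Rule 2: a send precedes its corresponding receive.
    for send, receive in messages:
        edges[send].add(receive)

    def hb(a, b):
        # Rule 3: transitivity, implemented as graph reachability from a.
        stack, seen = list(edges[a]), set()
        while stack:
            node = stack.pop()
            if node == b:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(edges[node])
        return False

    return hb


def concurrent(hb, a, b):
    """Two distinct events are concurrent when neither precedes the other."""
    return a != b and not hb(a, b) and not hb(b, a)


# Hypothetical log: P1 sends a message at a2 that P2 receives at b2.
process_events = {"P1": ["a1", "a2"], "P2": ["b1", "b2", "b3"]}
messages = [("a2", "b2")]
hb = build_happens_before(process_events, messages)

print(hb("a1", "b3"))              # a1 -> a2 -> b2 -> b3, by transitivity
print(concurrent(hb, "a1", "b1"))  # no chain in either direction
```

Here `a1` and `b1` are concurrent because no chain of local steps and messages connects them in either direction.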

Logical Clocks #

| Clock Type | Tracks | Limitation |
| --- | --- | --- |
| Lamport Clock | Monotonic counter per event/message | Can’t distinguish concurrent from ordered |
| Vector Clock | One counter per process (vector) | Detects concurrency; size grows with N |
| Hybrid Logical | Physical time + logical counter | Bounded size; used by CockroachDB |
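To make the first two rows concrete, here is a minimal sketch of a Lamport clock and a vector clock, assuming a fixed set of two processes. The class and function names (`LamportClock`, `VectorClock`, `vc_before`, `vc_concurrent`) are illustrative choices for this example, not taken from any particular library.

```python
class LamportClock:
    """Single counter: incremented on every event, merged on receive."""

    def __init__(self):
        self.time = 0

    def tick(self):
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        self.time = max(self.time, msg_time) + 1
        return self.time


class VectorClock:
    """One counter per process; assumes a fixed process count n."""

    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n

    def tick(self):
        self.v[self.pid] += 1
        return tuple(self.v)

    def send(self):
        return self.tick()

    def receive(self, other):
        # Element-wise max merges the sender's knowledge, then count the receive.
        self.v = [max(a, b) for a, b in zip(self.v, other)]
        return self.tick()


def vc_before(a, b):
    """a happens-before b iff a <= b element-wise and a != b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def vc_concurrent(a, b):
    return a != b and not vc_before(a, b) and not vc_before(b, a)


# Scenario: P1 has a local event, P0 has a local event and then sends to P1.
l0, l1 = LamportClock(), LamportClock()
lamport_e1 = l1.tick()                   # P1 local event
l0.tick()                                # P0 local event
lamport_send = l0.send()                 # P0 sends
lamport_recv = l1.receive(lamport_send)  # P1 receives

p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
vc_e1 = p1.tick()
p0.tick()
vc_send = p0.send()
vc_recv = p1.receive(vc_send)
```

In this scenario P1's local event and P0's send are causally unrelated. The Lamport timestamps still compare as smaller and larger, so the comparison alone cannot tell concurrency from order, while `vc_concurrent(vc_e1, vc_send)` identifies the pair as concurrent.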

Total vs Partial Ordering #

  • Partial: Only causally related events are ordered (happens-before); concurrent events are left unordered.
  • Total: Every pair of events is ordered. Across replicas this requires consensus (e.g. a Raft log), which is expensive.

Instinct #

Most systems need causal ordering at most, not total. Social feeds need causal (replies after originals). Payment ledgers need total (single sequence per account). Choose the cheapest ordering that satisfies your requirements.
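The “replies after originals” requirement can be sketched as a small delivery buffer: a reply that arrives before its parent is held back until the parent has been delivered. The `CausalFeed` class and the post ids below are hypothetical, and the sketch assumes each post has at most one causal dependency (the post it replies to).

```python
class CausalFeed:
    """Delivers a post only after the post it replies to has been delivered."""

    def __init__(self):
        self.delivered = []   # post ids in delivery order
        self._seen = set()    # ids already delivered
        self._waiting = {}    # parent id -> posts blocked on that parent

    def receive(self, post_id, reply_to=None):
        if reply_to is not None and reply_to not in self._seen:
            # Parent not delivered yet: buffer instead of showing the reply.
            self._waiting.setdefault(reply_to, []).append(post_id)
            return
        self._deliver(post_id)

    def _deliver(self, post_id):
        queue = [post_id]
        while queue:
            current = queue.pop(0)
            self.delivered.append(current)
            self._seen.add(current)
            # Delivering a post may unblock replies that were waiting on it.
            queue.extend(self._waiting.pop(current, []))


feed = CausalFeed()
feed.receive("reply-2", reply_to="reply-1")  # arrives first, buffered
feed.receive("reply-1", reply_to="post")     # parent still missing, buffered
feed.receive("unrelated")                    # no dependency, delivered at once
feed.receive("post")                         # unblocks reply-1, then reply-2
print(feed.delivered)
```

Unrelated posts are never delayed, which is why causal ordering is cheaper than forcing every post into one global sequence.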

Framing #

Here’s an example of how to frame the trade-off:

“For the activity feed, causal consistency suffices. For the payment ledger, we need total ordering within an account; a single-leader database gives us that within a partition.”
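As a toy model of the ledger half of that statement, the sketch below has one in-process object play the role of the single leader and assign a gap-free sequence number per account. `AccountLedger` is a hypothetical name, and a lock stands in for the serialisation a real single-leader database would provide.

```python
import threading
from collections import defaultdict


class AccountLedger:
    """Single writer that assigns a total order to entries within each account."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = defaultdict(list)  # account -> [(seq, amount), ...]

    def append(self, account, amount):
        # The lock serialises writers, so every entry in an account gets
        # a unique, strictly increasing sequence number.
        with self._lock:
            seq = len(self._entries[account]) + 1
            self._entries[account].append((seq, amount))
            return seq

    def history(self, account):
        with self._lock:
            return list(self._entries[account])


ledger = AccountLedger()
ledger.append("acct-1", 100)
ledger.append("acct-2", 50)
ledger.append("acct-1", -30)
print(ledger.history("acct-1"))
```

Entries in different accounts are not ordered relative to each other, which matches the “total ordering within an account” scope of the requirement.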

References #

DDIA 2e Reference #

  • Chapter 8: Ordering guarantees and unreliable clocks
  • Chapter 9: Total order broadcast, consensus, and their equivalence