Write-Ahead Logging
🟠P1 — the durability mechanism underlying every serious database
Problem
If a database crashes mid-write, data is corrupted or lost. We need crash recovery without sacrificing write performance.
Mechanism
Client write request
↓
1. Append to WAL (sequential write to disk — fast)
↓
2. Update in-memory data structure
↓
3. Acknowledge to client
↓
(Later) Flush in-memory data to data files (checkpoint)

The WAL is an append-only log of every mutation. On crash recovery, replay the WAL records written since the last checkpoint to reconstruct the in-memory state. Because the WAL is written sequentially and fsynced before the write is acknowledged, it is both fast and durable.
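The three steps and the replay-on-recovery behaviour can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any particular database lays out its log: the `TinyWAL` class, the one-JSON-record-per-line format, and the rule "a final line without a newline is a torn write" are all invented for illustration.

```python
import json
import os


class TinyWAL:
    """Toy key-value store that logs every write before applying it."""

    def __init__(self, path):
        self.path = path
        self.data = {}  # in-memory state, rebuilt from the log on startup
        self._replay()
        self.log = open(path, "ab")

    def _replay(self):
        """Crash recovery: re-apply every complete record in log order."""
        if not os.path.exists(self.path):
            return
        good_bytes = 0
        with open(self.path, "rb") as f:
            for line in f:
                if not line.endswith(b"\n"):
                    break  # torn final record from a crash mid-append
                record = json.loads(line)
                self.data[record["key"]] = record["value"]
                good_bytes += len(line)
        # Drop the torn tail so new appends start on a clean record boundary.
        os.truncate(self.path, good_bytes)

    def put(self, key, value):
        # 1. Append to the WAL and force it to disk.
        record = json.dumps({"key": key, "value": value}).encode("utf-8")
        self.log.write(record + b"\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Update the in-memory data structure.
        self.data[key] = value
        # 3. Returning from put() is the acknowledgement to the client.

    def close(self):
        self.log.close()
```

Writing a few keys, closing the store, and constructing a new `TinyWAL` on the same path yields the same `data` dictionary, because the constructor replays the log. The sketch has no checkpoint step, so its log grows without bound.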
Key Trade-offs
- Durability vs latency: fsync on every write guarantees durability but adds roughly 1 ms per write (the exact cost depends on the storage device). Batching fsyncs improves throughput but risks losing the last batch on crash.
- WAL size: The WAL must be periodically truncated after checkpointing. If checkpoints are infrequent, the WAL grows large and recovery time increases.
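Both trade-offs can be made concrete with a small sketch. Everything here is an assumption made for illustration: the `BatchedWAL` class, its `sync_every` knob, and the JSON snapshot file are hypothetical. A production implementation would also fsync the containing directory after the rename, which this sketch omits.

```python
import json
import os


class BatchedWAL:
    """Toy store with a tunable fsync interval and a checkpoint operation."""

    def __init__(self, wal_path, snapshot_path, sync_every=1):
        self.wal_path = wal_path
        self.snapshot_path = snapshot_path
        self.sync_every = sync_every  # 1 = fsync on every write
        self.unsynced = 0             # writes that a crash right now could lose
        self.data = {}
        self.log = open(wal_path, "ab")

    def put(self, key, value):
        record = json.dumps({"key": key, "value": value}).encode("utf-8")
        self.log.write(record + b"\n")
        self.unsynced += 1
        if self.unsynced >= self.sync_every:
            self.sync()
        # With sync_every > 1 this returns before the record is durable.
        self.data[key] = value

    def sync(self):
        self.log.flush()
        os.fsync(self.log.fileno())
        self.unsynced = 0

    def checkpoint(self):
        """Persist the full state, then truncate the log it makes redundant."""
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.snapshot_path)  # swap the snapshot in atomically
        # The snapshot now covers every logged write, so the log can be emptied.
        self.log.close()
        self.log = open(self.wal_path, "wb")  # "wb" truncates the file
        self.unsynced = 0

    def close(self):
        self.sync()
        self.log.close()
```

With `sync_every=1` every `put` pays for an fsync. With a larger value, up to `sync_every - 1` acknowledged writes can be lost in a crash. Calling `checkpoint()` more often keeps the log short, so recovery has less to replay.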
Instinct
WAL is not a pattern you implement — it’s a pattern you understand. Knowing that PostgreSQL’s WAL is the basis for replication (streaming replication sends WAL records) shows database internals knowledge. It also connects to event sourcing conceptually: both are append-only logs of mutations.
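The idea that replication can be built on the log is easy to see in miniature. The sketch below is conceptual only: `Leader`, `Follower`, and the integer log sequence number (`lsn`) are hypothetical names, everything lives in memory, and it does not model PostgreSQL's actual streaming protocol.

```python
class Leader:
    """Holds the authoritative log; every write gets the next sequence number."""

    def __init__(self):
        self.log = []   # list of (lsn, key, value), lsn starts at 1
        self.data = {}

    def put(self, key, value):
        lsn = len(self.log) + 1
        self.log.append((lsn, key, value))
        self.data[key] = value
        return lsn

    def records_after(self, lsn):
        # Record with sequence number n sits at index n - 1,
        # so everything newer than `lsn` starts at index `lsn`.
        return self.log[lsn:]


class Follower:
    """Rebuilds the leader's state purely by applying shipped log records."""

    def __init__(self):
        self.data = {}
        self.applied_lsn = 0  # highest sequence number applied so far

    def catch_up(self, leader):
        for lsn, key, value in leader.records_after(self.applied_lsn):
            self.data[key] = value
            self.applied_lsn = lsn
```

A follower that calls `catch_up` repeatedly only ever receives records it has not yet applied. This is the same replay loop used for crash recovery, pointed at another node's log instead of a local file.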
DDIA 2e Reference
- Chapter 3: B-Tree crash recovery, LSM memtable durability
- Chapter 5: Replication via WAL shipping