Statelessness & State Management
🔴 P0 — appears in nearly every system design interview
Problem #
Every service must decide where state lives. The decision cascades into scaling, failure recovery, deployment strategy, and operational complexity. Getting this wrong means every subsequent decision inherits unnecessary constraints.
Mechanism #
The spectrum:
- Fully stateless: no local state. All state is fetched per request from an external store.
- Stateful (in-memory): state lives in process memory. Fast, but couples scaling to the instance lifecycle.
- Externalised state: state lives in an external store (DB, Redis, S3). Services are stateless wrappers.

| Approach | Scaling | Failure Recovery | Main Trade-off |
|---|---|---|---|
| Fully stateless | Trivial (add instances) | Trivial (any instance works) | State access cost |
| Stateful | Requires sticky routing | State lost on crash | Lowest latency |
| Externalised | Near-trivial + store scales | Service restarts OK | Store becomes SPOF |
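The "Externalised" row can be illustrated with a small sketch. Everything here is hypothetical: `SessionStore`, `InMemoryStore`, and `CartService` are illustrative names, and the in-memory dictionary only stands in for a real external store such as Redis or a database. The point is that two service instances sharing one store are interchangeable, so losing an instance loses no state.

```python
from typing import Dict, Optional, Protocol


class SessionStore(Protocol):
    """Hypothetical interface for an external store (stand-in for Redis, a DB, ...)."""

    def get(self, key: str) -> Optional[int]: ...

    def set(self, key: str, value: int) -> None: ...


class InMemoryStore:
    """Test double that plays the role of the external store in this sketch."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def get(self, key: str) -> Optional[int]:
        return self._data.get(key)

    def set(self, key: str, value: int) -> None:
        self._data[key] = value


class CartService:
    """Stateless wrapper: keeps a reference to the store, but no per-user state."""

    def __init__(self, store: SessionStore) -> None:
        self._store = store

    def add_item(self, user_id: str) -> int:
        # Read-modify-write against the store on every request.
        # Not atomic: a real store would need an atomic increment or a transaction.
        count = (self._store.get(user_id) or 0) + 1
        self._store.set(user_id, count)
        return count


store = InMemoryStore()
instance_a = CartService(store)
instance_b = CartService(store)  # a second "instance" sharing the same store

instance_a.add_item("alice")
# Pretend instance_a crashes here; instance_b continues from the stored state.
result = instance_b.add_item("alice")
print(result)  # 2
```

The cost shows up in `add_item`: every call pays for a round trip to the store, which is the "state access cost" from the table.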
Key Trade-offs #
Latency vs Operational Simplicity #
- Stateful in-memory: sub-millisecond access, but you’re coupled to instance lifecycle
- Externalised state: one network hop per access (roughly 1-5 ms to Redis, 5-20 ms to a database, depending on network and query), but instances are disposable
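To make the latency side of this trade-off concrete, here is a back-of-the-envelope calculation using the ballpark per-access figures from the list above. The function name and the choice of four sequential accesses per request are assumptions for illustration only; real numbers depend on the network, the store, and whether accesses can be batched or parallelised.

```python
from typing import Tuple

# Ballpark per-access latency ranges in milliseconds, taken from the text above.
REDIS_MS: Tuple[float, float] = (1.0, 5.0)
DB_MS: Tuple[float, float] = (5.0, 20.0)


def added_latency_ms(sequential_accesses: int,
                     per_access_ms: Tuple[float, float]) -> Tuple[float, float]:
    """Return the (low, high) latency added by state accesses made one after another."""
    low, high = per_access_ms
    return sequential_accesses * low, sequential_accesses * high


# A hypothetical request that reads state four times in sequence.
redis_cost = added_latency_ms(4, REDIS_MS)
db_cost = added_latency_ms(4, DB_MS)

print(redis_cost)  # (4.0, 20.0)
print(db_cost)     # (20.0, 80.0)
```

Under these assumptions, the same request served from process memory would add well under a millisecond, which is why hot paths are the usual exception to externalising state.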
Scaling Friction #
- Stateless services scale horizontally with zero coordination
- Stateful services need sticky sessions or consistent hashing to route requests to the right instance — see also: Consistent Hashing
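As a rough illustration of the routing that stateful services need, here is a minimal consistent-hash ring. It is a sketch rather than a production implementation: the `HashRing` class, the node names, and the choice of 100 virtual nodes per instance are all assumptions, and it ignores replication and weighting.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List


def _hash(key: str) -> int:
    # Stable across processes, unlike built-in hash(), which is salted per run.
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Maps keys to nodes so that removing a node only moves that node's keys."""

    def __init__(self, nodes: Iterable[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._points: List[int] = []      # sorted positions on the ring
        self._owner: Dict[int, str] = {}  # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Each node gets several positions ("virtual nodes") to even out load.
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def route(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring has no nodes")
        # First ring position clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user-{i}" for i in range(1000)]
before = {k: ring.route(k) for k in keys}

ring.remove("node-c")
after = {k: ring.route(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

In this sketch, every key in `moved` was previously owned by `node-c`; keys on the surviving nodes stay where they were, so their in-memory state remains reachable. The state that lived on `node-c` is still lost, which is the failure-recovery cost of keeping it in process.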
Instinct #
Default to stateless with externalised state. Only pull state into the process when latency requirements demand it (e.g. hot-path caching, connection pools). When you do hold state in-memory, treat it as a cache with an authoritative external source — never as the source of truth.
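The "in-memory state as a cache, never the source of truth" rule might look like the following read-through cache. This is a minimal sketch under stated assumptions: `ReadThroughCache`, `load_from_source`, the key format, and the 30-second TTL are hypothetical, and the `source` dictionary stands in for the authoritative external store.

```python
import time
from typing import Callable, Dict, List, Tuple


class ReadThroughCache:
    """In-process cache whose entries can always be rebuilt from the source."""

    def __init__(self, load: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load            # fetches from the authoritative store
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, value)

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # fresh hit: no network hop
        value = self._load(key)      # miss or expired: go to the source
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        self._entries.pop(key, None)


# Stand-in for the external store, plus a log of how often it is hit.
source: Dict[str, str] = {"plan:alice": "free"}
loads: List[str] = []


def load_from_source(key: str) -> str:
    loads.append(key)
    return source[key]


cache = ReadThroughCache(load_from_source, ttl_seconds=30.0)
first = cache.get("plan:alice")    # miss: loaded from the source
second = cache.get("plan:alice")   # hit: served from process memory

source["plan:alice"] = "pro"       # writes go to the source, not the cache
cache.invalidate("plan:alice")
third = cache.get("plan:alice")    # reloaded after invalidation

# Simulate an instance restart: a new, empty cache recovers from the source.
restarted = ReadThroughCache(load_from_source, ttl_seconds=30.0)
recovered = restarted.get("plan:alice")
```

Because the cache never accepts writes of its own, discarding it (a crash, a deploy, a scale-down) costs only a few extra loads from the source.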
- INTERVIEW: Statelessness is almost never directly probed in system design interviews. It’s a foundational assumption that should be baked into the design from the start. If an interviewer detects a misconception about statefulness (e.g. storing session state in process memory without external backing), that’s a red flag.
References #
- 12-Factor App: Processes — “execute the app as one or more stateless processes”
- See also: Consistent Hashing (for stateful routing), Cache Patterns (in-memory state as cache)
DDIA 2e Reference #
- Chapter 1: Reliability, scalability, maintainability framing
- Chapter 6: Partitioning (state distribution)
- Chapter 5: Replication (state durability)