Statelessness & State Management

🔴 P0 — appears in nearly every system design interview

Problem #

Every service must decide where state lives. The decision cascades into scaling, failure recovery, deployment strategy, and operational complexity. Getting this wrong means every subsequent decision inherits unnecessary constraints.

Mechanism #

The spectrum:

Fully Stateless              Stateful (in-memory)           Externalised State
─────────────────────────────────────────────────────────────────────────────
  No local state.              State lives in process         State in external
  All state fetched            memory. Fast, but              store (DB, Redis,
  per-request from             couples scaling to             S3). Services are
  external store.              instance lifecycle.            stateless wrappers.

Approach          Scaling                       Failure recovery               Key trade-off
─────────────────────────────────────────────────────────────────────────────────────────────────
Fully stateless   Trivial (add instances)       Trivial (any instance works)   State access cost
Stateful          Requires sticky routing       State lost on crash            Lowest latency
Externalised      Near-trivial + store scales   Service restarts OK            Store becomes SPOF
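
To make the difference between the approaches concrete, here is a minimal sketch of the same per-user hit counter written two ways. Everything in it is an illustrative assumption rather than part of any real library: `Store` is a hypothetical interface, `DictStore` is an in-process stand-in for something like Redis or a database, and the class names are made up for this example.

```python
from typing import Dict, Protocol


class Store(Protocol):
    """Hypothetical interface for an external store (e.g. Redis or a DB)."""

    def get(self, key: str) -> int: ...
    def set(self, key: str, value: int) -> None: ...


class DictStore:
    """In-process stand-in for an external store, for illustration only."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def get(self, key: str) -> int:
        return self._data.get(key, 0)

    def set(self, key: str, value: int) -> None:
        self._data[key] = value


class StatefulCounter:
    """Holds state in process memory: fast, but lost if this instance dies."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}

    def hit(self, user: str) -> int:
        self._counts[user] = self._counts.get(user, 0) + 1
        return self._counts[user]


class StatelessCounter:
    """Keeps nothing between requests; every call goes to the shared store."""

    def __init__(self, store: Store) -> None:
        self._store = store

    def hit(self, user: str) -> int:
        # Read-modify-write is not atomic here; a real store would need
        # an atomic increment to be safe under concurrent requests.
        value = self._store.get(user) + 1
        self._store.set(user, value)
        return value
```

If requests for one user alternate between two `StatelessCounter` instances that share a store, the counts come back as 1, 2, 3. With two `StatefulCounter` instances the same traffic yields 1, 1, 2, which is why the stateful variant needs sticky routing.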

Key Trade-offs #

Latency vs Operational Simplicity #

  • Stateful in-memory: sub-millisecond access, but you’re coupled to instance lifecycle
  • Externalised state: network hop per access (~1-5ms to Redis, ~5-20ms to DB), but instances are disposable
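
The per-access figures above add up quickly when a request touches state several times in sequence. The following back-of-the-envelope sketch uses the rough ranges quoted in the bullets; the function name and the constants are illustrative assumptions, and real numbers depend on network topology and payload size.

```python
from typing import Tuple

# Rough per-access ranges in milliseconds, taken from the bullets above.
REDIS_MS: Tuple[float, float] = (1.0, 5.0)
DB_MS: Tuple[float, float] = (5.0, 20.0)


def added_latency_ms(
    sequential_accesses: int, per_access_ms: Tuple[float, float]
) -> Tuple[float, float]:
    """Return the (low, high) latency added by N sequential state accesses."""
    low, high = per_access_ms
    return sequential_accesses * low, sequential_accesses * high


# Ten sequential Redis reads add roughly 10 to 50 ms to a request.
print(added_latency_ms(10, REDIS_MS))  # (10.0, 50.0)
```

Batching or pipelining the accesses reduces the number of round trips, which is usually the first thing to try before pulling state back into the process.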

Scaling Friction #

  • Stateless services scale horizontally with zero coordination
  • Stateful services need sticky sessions or consistent hashing to route requests to the right instance — see also: Consistent Hashing
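
As a rough illustration of the consistent hashing option, the sketch below routes each key to the instance that owns the next point on a hash ring. The `HashRing` class, the choice of MD5 as the hash, and the default of 100 virtual nodes per instance are assumptions made for this example, not a reference implementation.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    """Minimal consistent hash ring with virtual nodes."""

    def __init__(self, nodes: Iterable[str], vnodes: int = 100) -> None:
        # Each instance is placed on the ring at `vnodes` pseudo-random points.
        self._ring: List[Tuple[int, str]] = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        if not self._ring:
            raise ValueError("HashRing needs at least one node")
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Return the instance owning the first ring point after the key."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["app-1", "app-2", "app-3"])
owner = ring.node_for("session:42")  # the same key always maps to the same instance
```

The property that matters for stateful services is that adding an instance only moves keys onto the new instance; keys owned by the other instances stay where they are, so most in-memory state remains reachable after a scale-out.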

Instinct #

Default to stateless with externalised state. Only pull state into the process when latency requirements demand it (e.g. hot-path caching, connection pools). When you do hold state in-memory, treat it as a cache with an authoritative external source — never as the source of truth.
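
One way to apply this rule is a read-through cache with a time-to-live, sketched below. The `ReadThroughCache` class and its `load` callback are hypothetical names for this example; `load` stands in for whatever call reads from the authoritative external store.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """In-memory cache in front of an authoritative external source."""

    def __init__(
        self,
        load: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load      # reads from the source of truth
        self._ttl = ttl_seconds
        self._clock = clock    # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]    # fresh hit: no network hop
        # Miss or expired: fall back to the authoritative source.
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry, e.g. after writing to the external store."""
        self._entries.pop(key, None)
```

Because every entry can be rebuilt from `load`, losing the process loses only warm-up time, not data. The cost is bounded staleness: a reader may see a value up to `ttl_seconds` old unless the writer calls `invalidate`.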

  • INTERVIEW: Statelessness is almost never directly probed in system design interviews. It’s a foundational assumption that should be baked into the design from the start. If an interviewer detects a misconception about statefulness (e.g. storing session state in process memory without external backing), that’s a red flag.

References #

DDIA 2e Reference #

  • Chapter 1: Reliability, scalability, maintainability framing
  • Chapter 6: Partitioning (state distribution)
  • Chapter 5: Replication (state durability)