Cache Patterns

🔴 P0 — the primary tool for read scaling; multiple patterns with different consistency guarantees

Problem #

Repeated reads of the same data are slow and expensive when every one of them hits the database. Caching reduces latency and database load but introduces consistency challenges (stale data) and operational complexity (cache warming, eviction).

Patterns #

| Pattern | Write path | Read path | Consistency |
| --- | --- | --- | --- |
| Cache-Aside | Write to DB only | Check cache → miss → read DB → populate cache | Eventually consistent |
| Read-Through | Write to DB only | Cache handles DB fetch on miss | Eventually consistent |
| Write-Through | Write to cache AND DB (sync) | Always read from cache | Strong (but slow writes) |
| Write-Behind | Write to cache, async flush to DB | Always read from cache | Risk of data loss |
| Refresh-Ahead | Cache proactively refreshes before TTL | Always read from cache | Near-real-time |
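
The difference between the Write-Through and Write-Behind rows can be sketched in a few lines. This is a minimal illustration, not a production design: plain dicts stand in for both the cache and the database, and the class names and the `flush` method are hypothetical.

```python
from typing import Dict, List, Tuple


class WriteThroughCache:
    """Writes go to the DB and the cache in the same synchronous call."""

    def __init__(self, db: Dict[str, str]) -> None:
        self.db = db
        self.cache: Dict[str, str] = {}

    def write(self, key: str, value: str) -> None:
        self.db[key] = value      # the caller waits for the DB write
        self.cache[key] = value   # cache and DB agree once write() returns


class WriteBehindCache:
    """Writes go to the cache first; the DB is updated later by flush()."""

    def __init__(self, db: Dict[str, str]) -> None:
        self.db = db
        self.cache: Dict[str, str] = {}
        self.pending: List[Tuple[str, str]] = []  # buffered, not yet durable

    def write(self, key: str, value: str) -> None:
        self.cache[key] = value
        self.pending.append((key, value))

    def flush(self) -> int:
        """Apply buffered writes to the DB and return how many were applied."""
        flushed = len(self.pending)
        for key, value in self.pending:
            self.db[key] = value
        self.pending.clear()
        return flushed
```

Anything still in `pending` when the process dies is lost, which is the durability risk listed in the table.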

Cache-Aside (Most Common) #

Read:
  1. Check cache for key
  2. HIT  → return cached value
  3. MISS → query database
  4. Store result in cache with TTL
  5. Return result

Write:
  1. Write to database
  2. Invalidate cache key (don't update — see Cache Invalidation)

Distributed Caching (Redis Patterns) #

Key patterns beyond simple key-value:

  • Sorted Sets for leaderboards: ZADD inserts and ZRANK lookups are both O(log N), so ranked access stays cheap
  • Hash for structured objects: HSET → partial updates without full serialisation
  • Pub/Sub for cache invalidation: broadcast invalidation events across instances
  • Lua scripting for atomic ops: check-and-set without race conditions
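
The Pub/Sub invalidation bullet can be illustrated without a Redis server. The sketch below is an in-process simulation only: `InvalidationBus` is a hypothetical stand-in for a pub/sub channel, and each `AppInstance` represents one application server with its own local cache.

```python
from typing import Callable, Dict, List


class InvalidationBus:
    """In-process stand-in for a pub/sub channel (hypothetical)."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, handler: Callable[[str], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, key: str) -> None:
        for handler in self._subscribers:
            handler(key)


class AppInstance:
    """One application server holding its own local cache."""

    def __init__(self, db: Dict[str, int], bus: InvalidationBus) -> None:
        self.db = db
        self.bus = bus
        self.local_cache: Dict[str, int] = {}
        bus.subscribe(self._on_invalidate)

    def _on_invalidate(self, key: str) -> None:
        self.local_cache.pop(key, None)  # drop the entry; next read refetches

    def read(self, key: str) -> int:
        if key not in self.local_cache:
            self.local_cache[key] = self.db[key]
        return self.local_cache[key]

    def write(self, key: str, value: int) -> None:
        self.db[key] = value
        self.bus.publish(key)  # tell every instance, including this one


db = {"price": 10}
bus = InvalidationBus()
a, b = AppInstance(db, bus), AppInstance(db, bus)
a.read("price"), b.read("price")  # both instances now hold 10 locally
a.write("price", 12)              # b's copy is invalidated by the broadcast
```

Real Redis Pub/Sub delivers messages at most once, so a disconnected instance can miss an invalidation. That is one reason to keep a TTL on locally cached entries as well.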

CDN Patterns #

  • Static assets: Cache forever with content-hashed URLs
  • API responses: Short TTLs + Surrogate-Key headers for targeted purging
  • Edge compute: Cloudflare Workers / Lambda@Edge
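
A sketch of the first two bullets: deriving a content-hashed asset name, and building response headers for a long-lived static asset versus a short-lived API response. The function names and the 12-character digest length are assumptions made for illustration. `Cache-Control` directives are standard HTTP, while `Surrogate-Key` support and purge behaviour vary by CDN, so check the provider's documentation before relying on it.

```python
import hashlib
from typing import Dict, List


def hashed_asset_name(filename: str, content: bytes) -> str:
    """Embed a content digest in the name so a new build gets a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension: append the digest
        return f"{filename}.{digest}"
    return f"{stem}.{digest}.{ext}"


def static_asset_headers() -> Dict[str, str]:
    """Safe to cache for a year because the URL changes with the content."""
    return {"Cache-Control": "public, max-age=31536000, immutable"}


def api_response_headers(ttl_seconds: int, surrogate_keys: List[str]) -> Dict[str, str]:
    """Short shared-cache TTL plus tags that a purge request can target."""
    return {
        "Cache-Control": f"public, s-maxage={ttl_seconds}",
        "Surrogate-Key": " ".join(surrogate_keys),
    }
```

With this scheme a deploy never needs to purge static assets: old URLs keep serving old content, and new pages reference the new names.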

Instinct: CDN is your first line of defence for read scaling. Even 5-second edge caching keeps thousands of requests from reaching your backend.

Instinct #

Cache-Aside is the default.

Write-Through for systems that can’t tolerate stale reads or cache misses on recently written data.

Write-Behind only when you accept the durability risk (write buffer).

Never cache without a TTL — it’s the safety net against staleness.

Invalidate on write, don’t update — avoids race conditions between concurrent writers.
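
To see why updating the cache on write is risky, the sketch below replays one specific interleaving deterministically: two writers whose database writes land in the order A, B but whose cache steps land in the order B, A. The `replay` function is hypothetical and models only this single interleaving, not real concurrency.

```python
from typing import Dict, Optional, Tuple


def replay(update_cache: bool) -> Tuple[str, Optional[str]]:
    """Return (db_value, cached_value) after the interleaving described above."""
    db: Dict[str, str] = {}
    cache: Dict[str, str] = {}

    # Database writes arrive in order: writer A, then writer B.
    db["k"] = "a"
    db["k"] = "b"

    # Cache steps arrive in the opposite order: writer B, then writer A.
    for value in ("b", "a"):
        if update_cache:
            cache["k"] = value      # update-on-write: last cache step wins
        else:
            cache.pop("k", None)    # invalidate-on-write: order is irrelevant

    return db["k"], cache.get("k")


stale = replay(update_cache=True)    # ("b", "a"): cache disagrees with the DB
clean = replay(update_cache=False)   # ("b", None): next read repopulates from the DB
```

Invalidation is not completely race-free either: a reader that fetched the old value just before the write can still populate the cache after the invalidation. The TTL bounds how long such an entry survives.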

See also: Cache Invalidation for the hard part.

Interview Choreography: When and How to Introduce Caching #

When to Introduce #

  • INTERVIEW: Establish necessity first. Don’t reach for caching by default — identify the performance problem, quantify it with rough numbers, then explain the value of caching:

Read-heavy workload: #

We’re serving 10M daily active users, each making 20 requests per day. That’s 200M reads hitting the database. Even with indexes, we’re looking at 20-50ms per query. A cache drops that to under 2ms and takes most of the load off the database.
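
The back-of-envelope numbers above, worked through. The 90% hit ratio is an assumed figure for illustration and does not come from the scenario, and the result is an average rate; peak traffic will be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60

daily_active_users = 10_000_000
requests_per_user_per_day = 20

daily_reads = daily_active_users * requests_per_user_per_day  # 200,000,000
average_qps = daily_reads / SECONDS_PER_DAY                   # about 2,315 reads/s

ASSUMED_HIT_RATIO = 0.9  # hypothetical: 9 in 10 reads are served by the cache
db_qps_with_cache = average_qps * (1 - ASSUMED_HIT_RATIO)     # about 231 reads/s
```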

Expensive queries: #

Computing a user’s personalised feed requires joining posts, followers, and likes across multiple tables. That query takes 200ms. We can cache the computed feed for 60 seconds and serve it in 1ms from Redis.

Latency requirements: #

We need sub-10ms response times for the API. Database queries are taking 30-50ms. We have to cache.

How to Introduce (5-Step Approach) #

  1. Identify the bottleneck: the specific problem — DB load? query latency? expensive computations?
  2. Decide what to cache: focus on stable, frequent reads and expensive computations. Consider cache-key design: how will we look up the data?
  3. Choose cache architecture:
    • Strong consistency needed → write-through cache
    • High volume, durability risk tolerable → write-behind
    • Static content → CDN caching
    • Extremely hot keys → in-process caching as an optimisation layer
  4. Set eviction policy: LRU is the safe default. TTLs for freshness.
  5. Address the tradeoffs:
    • INTERVIEW: Don’t enumerate all failure modes. Pick the 1-2 most relevant for the system. FLEX: focus on important but non-obvious scenarios (gotchas).
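
The LRU default from step 4 can be sketched with the standard library. This is a minimal single-threaded illustration with a hypothetical `LRUCache` class. Managed caches usually expose eviction as configuration rather than code, so treat this as a model of the behaviour rather than something to deploy.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now more recent than "b"
cache.put("c", 3)  # over capacity: "b" is evicted
```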

Caching Layers #

  • INTERVIEW: Start with the basics: external caching via Redis. Layer on additional caching (CDN, client-side, in-process) only when justified by the access pattern.
  • INTERVIEW: Use in-process caching for small, stable, frequently-accessed values: config values, feature flags, small reference datasets, hot keys, rate-limiting counters, pre-computed values. This is an optimisation layer after external cache, not a replacement.
  • INSIGHT: CDN caching is primarily for static media at scale. Modern CDNs can cache API responses and do edge logic, but in interviews, only introduce CDNs for serving static media.
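
A sketch of in-process caching layered in front of an external cache, as described in the bullets above. The assumptions are loud ones: a dict stands in for the external cache (Redis in the text), `loader` stands in for the database query, and `TwoTierCache` and its 5-second local TTL are hypothetical choices for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class TwoTierCache:
    """Local in-process cache in front of an external cache and a loader."""

    def __init__(
        self,
        external: Dict[str, str],
        loader: Callable[[str], str],
        local_ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._external = external
        self._loader = loader
        self._local_ttl = local_ttl_seconds
        self._clock = clock
        self._local: Dict[str, Tuple[str, float]] = {}  # key -> (value, expires_at)
        self.local_hits = 0

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._local.get(key)
        if entry is not None and now < entry[1]:
            self.local_hits += 1          # served without any network hop
            return entry[0]
        value = self._external.get(key)   # second tier: shared external cache
        if value is None:
            value = self._loader(key)     # last resort: the source of truth
            self._external[key] = value
        self._local[key] = (value, now + self._local_ttl)
        return value
```

The short local TTL is the trade-off: each instance may serve a value that is up to `local_ttl_seconds` stale, in exchange for skipping the network round trip on hot keys.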

References #

DDIA 2e Reference #

  • Chapter 5: Caching as a form of replication
  • Chapter 12: Derived data and materialised views