Cache Patterns

🔴 P0 — the primary tool for read scaling; multiple patterns with different consistency guarantees

Problem #

Serving the same read from the database again and again costs latency and database capacity. Caching reduces both, but it introduces consistency challenges (stale data) and operational complexity (cache warming, eviction).

Patterns #

Pattern	Write path	Read path	Consistency
Cache-Aside	Write to DB only	Check cache → miss → read DB → populate cache	Eventually consistent
Read-Through	Write to DB only	Cache handles DB fetch on miss	Eventually consistent
Write-Through	Write to cache AND DB (sync)	Always read from cache	Strong (but slow writes)
Write-Behind	Write to cache, async flush to DB	Always read from cache	Risk of data loss
Refresh-Ahead	Cache proactively refreshes before TTL	Always read from cache	Near-real-time
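
The difference between the Write-Through and Write-Behind rows can be sketched as follows. This is a minimal illustration, assuming plain dicts stand in for the cache and the database; the class names and the `flush()` method are hypothetical and not taken from any library.

```python
class WriteThroughCache:
    """Writes go to the database and the cache before returning."""

    def __init__(self, db):
        self.db = db              # dict standing in for the database
        self.cache = {}

    def write(self, key, value):
        self.db[key] = value      # synchronous database write (the slow part)
        self.cache[key] = value   # cache updated in the same call

    def read(self, key):
        return self.cache.get(key)  # reads are served from the cache


class WriteBehindCache:
    """Writes land in the cache; the database is updated on a later flush."""

    def __init__(self, db):
        self.db = db
        self.cache = {}
        self.dirty = set()        # keys written but not yet persisted

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)       # lost if the process dies before flush()

    def flush(self):
        # In a real system this would run asynchronously, e.g. on a timer.
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()

    def read(self, key):
        return self.cache.get(key)
```

Between `write()` and `flush()` the write-behind database is missing the new value, which is the data-loss window the table refers to.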

Cache-Aside (Most Common) #

Read:
  1. Check cache for key
  2. HIT  → return cached value
  3. MISS → query database
  4. Store result in cache with TTL
  5. Return result

Write:
  1. Write to database
  2. Invalidate cache key (don't update — see Cache Invalidation)
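
The read and write steps above can be sketched in Python. This is an illustration under stated assumptions: `TTLCache` is a hypothetical in-memory stand-in for Redis or Memcached, the database is a plain dict, and the `user:<id>` key format and 60-second default TTL are arbitrary choices.

```python
import time


class TTLCache:
    """Tiny in-memory cache with per-key expiry (stand-in for an external cache)."""

    def __init__(self):
        self._items = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._items[key]      # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, time.monotonic() + ttl_seconds)

    def delete(self, key):
        self._items.pop(key, None)


def read_user(cache, db, user_id, ttl_seconds=60):
    key = f"user:{user_id}"
    cached = cache.get(key)             # read step 1: check the cache
    if cached is not None:
        return cached                   # read step 2: hit
    value = db[user_id]                 # read step 3: miss, query the database
    cache.set(key, value, ttl_seconds)  # read step 4: populate with a TTL
    return value                        # read step 5


def write_user(cache, db, user_id, value):
    db[user_id] = value                 # write step 1: database first
    cache.delete(f"user:{user_id}")     # write step 2: invalidate, don't update
```

Because `None` signals a miss in this sketch, it cannot cache a genuinely absent row; a real implementation would need a sentinel value for that case.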

Distributed Caching (Redis Patterns) #

Key patterns beyond simple key-value:

Sorted Sets for leaderboards: ZADD → O(log N) ranked access
Hash for structured objects: HSET → partial updates without full serialisation
Pub/Sub for cache invalidation: broadcast invalidation events across instances
Lua scripting for atomic ops: check-and-set without race conditions
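
A hedged sketch of the four patterns above. It assumes `client` behaves like a redis-py `redis.Redis(decode_responses=True)` instance and uses only its `zadd`, `zrevrange`, `hset`, `publish`, and `eval` methods. The function names, key formats, and the `cache-invalidation` channel name are hypothetical.

```python
# Compare-and-set, executed atomically inside Redis as a Lua script.
CAS_SCRIPT = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
  redis.call('SET', KEYS[1], ARGV[2])
  return 1
end
return 0
"""


def record_score(client, board, player, score):
    # Sorted set: insert or update a member in O(log N).
    client.zadd(board, {player: score})


def top_players(client, board, n):
    # Highest scores first, returned as (member, score) pairs.
    return client.zrevrange(board, 0, n - 1, withscores=True)


def rename_user(client, user_id, new_name):
    # Hash: update one field without rewriting the whole object.
    client.hset(f"user:{user_id}", "name", new_name)


def broadcast_invalidation(client, key, channel="cache-invalidation"):
    # Pub/Sub: every subscribed instance can drop its local copy of `key`.
    client.publish(channel, key)


def compare_and_set(client, key, expected, new_value):
    # Lua: the GET and SET run as one atomic step on the server.
    return client.eval(CAS_SCRIPT, 1, key, expected, new_value) == 1
```

Redis Pub/Sub does not store messages, so an instance that is disconnected when an invalidation is published never sees it. That is one reason to keep a TTL on locally cached values as well.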

CDN Patterns #

Static assets: Cache forever with content-hashed URLs
API responses: Short TTLs + Surrogate-Key headers for targeted purging
Edge compute: Cloudflare Workers / Lambda@Edge

Instinct: CDN is your first line of defence for read scaling. Even 5-second edge caching keeps thousands of requests from reaching your backend.
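
A small sketch of the first two CDN patterns, assuming a build step that renames assets and an origin that sets response headers. The function names are hypothetical, and `Surrogate-Key` is a Fastly-style header; support for purge-by-key and the exact header name vary by CDN.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_asset_name(path, content):
    """Return a name like 'static/app.<hash>.js'; new content yields a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


def static_asset_headers():
    # Safe to cache "forever": the URL changes whenever the content does.
    return {"Cache-Control": "public, max-age=31536000, immutable"}


def api_response_headers(ttl_seconds, surrogate_keys):
    return {
        # s-maxage applies to shared caches such as CDNs.
        "Cache-Control": f"public, s-maxage={ttl_seconds}",
        # Space-separated tags that a targeted purge can match on.
        "Surrogate-Key": " ".join(surrogate_keys),
    }
```

For example, `api_response_headers(5, ["product-42", "catalog"])` describes a response the edge may reuse for five seconds and that can be purged early by either tag.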

Instinct #

Cache-Aside is the default.

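Write-Through for systems that can’t tolerate stale reads after a write, at the cost of slower writes.

Write-Behind only when you accept the durability risk: writes buffered in the cache are lost if it fails before they are flushed to the database.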

Never cache without a TTL — it’s the safety net against staleness.

Invalidate on write, don’t update — avoids race conditions between concurrent writers.

See also: Cache Invalidation for the hard part.
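
The "invalidate, don't update" rule can be illustrated with one fixed interleaving of two concurrent writers. This is a deterministic simulation with dicts standing in for the database and cache, not real concurrency; the function names are hypothetical.

```python
def update_cache_on_write():
    db, cache = {}, {}
    # Writers A and B race. Interleaving: A-db, B-db, B-cache, A-cache.
    db["k"] = "A"
    db["k"] = "B"
    cache["k"] = "B"
    cache["k"] = "A"        # A's slower cache update lands last
    return db["k"], cache.get("k")


def invalidate_on_write():
    db, cache = {}, {}
    # Same interleaving, but each writer deletes the key instead of setting it.
    db["k"] = "A"
    db["k"] = "B"
    cache.pop("k", None)    # B invalidates
    cache.pop("k", None)    # A invalidates; the order no longer matters
    # The next reader misses and repopulates from the database.
    if "k" not in cache:
        cache["k"] = db["k"]
    return db["k"], cache["k"]


print(update_cache_on_write())  # ('B', 'A'): cache disagrees with the database
print(invalidate_on_write())    # ('B', 'B'): cache matches the database
```

Invalidation removes the writer-versus-writer race only. A slow reader can still repopulate the cache with a value it fetched before the write, which is another reason to keep a TTL.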

Interview Choreography: When and How to Introduce Caching #

When to Introduce #

INTERVIEW: Establish necessity first. Don’t reach for caching by default — identify the performance problem, quantify it with rough numbers, then explain the value of caching:

Read-heavy workload: #

We’re serving 10M daily active users, each making 20 requests per day. That’s 200M reads hitting the database. Even with indexes, we’re looking at 20-50ms per query. A cache drops that to under 2ms and takes most of the load off the database.
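
The back-of-envelope numbers can be worked through explicitly. The 95% hit ratio is an assumption for illustration and does not come from the text; the 35 ms database figure is the midpoint of the quoted 20-50 ms range, and 2 ms is the quoted upper bound for a cache read.

```python
DAILY_USERS = 10_000_000
REQUESTS_PER_USER = 20
SECONDS_PER_DAY = 86_400

reads_per_day = DAILY_USERS * REQUESTS_PER_USER       # 200,000,000
avg_reads_per_sec = reads_per_day / SECONDS_PER_DAY   # about 2,315

HIT_RATIO = 0.95   # assumed, not from the text
CACHE_MS = 2       # upper bound quoted above
DB_MS = 35         # midpoint of the 20-50 ms range

# Only misses reach the database.
db_reads_per_sec = avg_reads_per_sec * (1 - HIT_RATIO)   # about 116

# A miss pays for the cache lookup plus the database query.
avg_latency_ms = HIT_RATIO * CACHE_MS + (1 - HIT_RATIO) * (CACHE_MS + DB_MS)

print(round(avg_reads_per_sec), round(db_reads_per_sec), round(avg_latency_ms, 2))
```

These are daily averages; peak traffic will be higher, so the same arithmetic is worth repeating with a peak multiplier before sizing anything.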

Expensive queries: #

Computing a user’s personalised feed requires joining posts, followers, and likes across multiple tables. That query takes 200ms. We can cache the computed feed for 60 seconds and serve it in 1ms from Redis.

Latency requirements: #

We need sub-10ms response times for the API. Database queries are taking 30-50ms. We have to cache.

How to Introduce (5-Step Approach) #

1. Identify the bottleneck: what is the specific problem? DB load, query latency, or expensive computations?
2. Decide what to cache: focus on stable, frequently read data and expensive computations. Consider cache-key design: how will we look up the data?
3. Choose the cache architecture:
   - Strong consistency needed → write-through cache
   - High volume, durability risk tolerable → write-behind
   - Static content → CDN caching
   - Extremely hot keys → in-process caching as an optimisation layer
4. Set the eviction policy: LRU is the safe default. Use TTLs for freshness.
5. Address the tradeoffs:
   - INTERVIEW: Don’t enumerate all failure modes; pick the 1-2 most relevant for the system.
   - FLEX: Focus on important but non-obvious scenarios (gotchas).
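
The eviction step, LRU for capacity plus TTLs for freshness, can be sketched with `collections.OrderedDict`. The class name and the injectable `clock` parameter are hypothetical; the clock is there only so that expiry can be exercised without sleeping.

```python
import time
from collections import OrderedDict


class LRUCacheWithTTL:
    def __init__(self, max_entries, ttl_seconds, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._items = OrderedDict()   # key -> (value, expires_at), coldest first

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]          # TTL: too old to trust
            return None
        self._items.move_to_end(key)      # mark as most recently used
        return value

    def set(self, key, value):
        self._items[key] = (value, self.clock() + self.ttl_seconds)
        self._items.move_to_end(key)
        if len(self._items) > self.max_entries:
            self._items.popitem(last=False)   # LRU: evict the coldest key
```

The two mechanisms answer different questions: LRU decides what to drop when memory runs out, while the TTL bounds how stale a surviving entry can be.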

Caching Layers #

INTERVIEW: Start with the basics: external caching via Redis. Layer on additional caching (CDN, client-side, in-process) only when justified by the access pattern.
INTERVIEW: Use in-process caching for small, stable, frequently-accessed values: config values, feature flags, small reference datasets, hot keys, rate-limiting counters, pre-computed values. This is an optimisation layer after external cache, not a replacement.
INSIGHT: CDN caching is primarily for static media at scale. Modern CDNs can cache API responses and do edge logic, but in interviews, only introduce CDNs for serving static media.
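
A minimal sketch of in-process caching layered in front of an external cache. It assumes a dict stands in for Redis and `loader` is a function that queries the database; `TieredCache`, `loader`, and the 5-second local TTL are hypothetical names and values.

```python
import time


class TieredCache:
    """In-process L1 in front of an external L2 (a dict stands in for Redis)."""

    def __init__(self, remote, loader, local_ttl_seconds=5, clock=time.monotonic):
        self.remote = remote              # external cache shared by all instances
        self.loader = loader              # function that queries the database
        self.local_ttl = local_ttl_seconds
        self.clock = clock
        self._local = {}                  # key -> (value, expires_at)

    def get(self, key):
        entry = self._local.get(key)
        if entry is not None and self.clock() < entry[1]:
            return entry[0]               # L1 hit: no network hop
        value = self.remote.get(key)
        if value is None:                 # L2 miss: fall back to the database
            value = self.loader(key)
            self.remote[key] = value
        self._local[key] = (value, self.clock() + self.local_ttl)
        return value
```

The local TTL is kept short because each application instance holds its own copy, and a change in the external cache is not visible locally until that copy expires.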

References #

Redis Patterns — official documentation
Scaling Memcache at Facebook — Nishtala et al. (2013); the canonical paper

DDIA 2e Reference #

Chapter 5: Caching as a form of replication
Chapter 12: Derived data and materialised views