Back-of-Envelope Estimation
🔴 P0 — expected in every system design interview; grounds design decisions in numbers
Problem #
Design decisions need quantitative grounding. “Should we cache this?” depends on how many reads per second and how large the dataset is. Without estimation, you’re guessing.
Key Numbers to Internalise #
Latency #
| Operation | Time |
|---|---|
| L1 cache reference | ~1 ns |
| L2 cache reference | ~4 ns |
| Main memory reference | ~100 ns |
| SSD random read | ~16 μs |
| HDD random read | ~2 ms |
| Same-datacentre round trip | ~500 μs |
| Cross-continent round trip | ~150 ms |
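
These figures are most useful as ratios. A minimal sketch, assuming the table values above are close enough for order-of-magnitude work (the names `LATENCY_NS`, `ratio`, and `sequential_cost_ms` are illustrative, not from any library):

```python
# Latency figures copied from the table above, converted to nanoseconds.
LATENCY_NS = {
    "l1_cache": 1,
    "l2_cache": 4,
    "main_memory": 100,
    "ssd_random_read": 16_000,
    "hdd_random_read": 2_000_000,
    "same_dc_round_trip": 500_000,
    "cross_continent_round_trip": 150_000_000,
}


def ratio(slow: str, fast: str) -> float:
    """How many `fast` operations fit in the time of one `slow` operation."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


def sequential_cost_ms(op: str, count: int) -> float:
    """Total time in milliseconds for `count` operations issued one after another."""
    return LATENCY_NS[op] * count / 1_000_000


# One cross-continent round trip costs as much as 300 same-datacentre round trips.
print(ratio("cross_continent_round_trip", "same_dc_round_trip"))
# Ten sequential same-datacentre calls already add 5 ms to a request.
print(sequential_cost_ms("same_dc_round_trip", 10))
```

The second call is the kind of check that justifies batching or parallelising downstream calls.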
Throughput #
| Resource | Throughput |
|---|---|
| SSD sequential read | ~1 GB/s |
| HDD sequential read | ~100 MB/s |
| 1 Gbps network | ~100 MB/s |
| 10 Gbps network | ~1 GB/s |
| Single Redis instance | ~100K ops/s |
| Single PostgreSQL instance | ~10K-50K queries/s |
| Single application server | ~1K-10K requests/s |
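
A typical use of the throughput figures is asking how long it takes to move a given amount of data. A rough sketch, assuming the rounded table values and ignoring protocol overhead and contention (`THROUGHPUT_BYTES_PER_S` and `transfer_seconds` are hypothetical names):

```python
# Sequential throughput figures from the table above, in bytes per second.
THROUGHPUT_BYTES_PER_S = {
    "ssd_sequential": 1e9,
    "hdd_sequential": 100e6,
    "network_1gbps": 100e6,
    "network_10gbps": 1e9,
}


def transfer_seconds(size_bytes: float, resource: str) -> float:
    """Best-case time to stream `size_bytes` through `resource`."""
    return size_bytes / THROUGHPUT_BYTES_PER_S[resource]


# Copying a 100 GB dataset over a 1 Gbps link: about 1,000 s, roughly 17 minutes.
print(transfer_seconds(100e9, "network_1gbps"))
# The same copy over 10 Gbps: about 100 s.
print(transfer_seconds(100e9, "network_10gbps"))
```

This is a lower bound: real transfers share the link and rarely sustain the full rate.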
Storage #
| Data | Size |
|---|---|
| 1 million users × 1KB each | ~1 GB |
| 1 billion rows × 100B each | ~100 GB |
| 1 day of logs at 10K rps (assuming ~100 B per log entry) | ~50-100 GB |
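
The rows above are simple multiplications. A small sketch that reproduces them, assuming decimal units (1 GB = 10^9 bytes) and, for the log row, roughly 100 bytes per log entry, which is an assumption rather than a figure from the table:

```python
SECONDS_PER_DAY = 86_400


def storage_bytes(count: int, bytes_each: int) -> int:
    """Raw size of `count` records of `bytes_each` bytes (no indexes, no replication)."""
    return count * bytes_each


def daily_log_bytes(requests_per_second: int, bytes_per_entry: int) -> int:
    """Log volume for one day, assuming one log entry per request."""
    return requests_per_second * SECONDS_PER_DAY * bytes_per_entry


def to_gb(size_bytes: int) -> float:
    """Convert bytes to decimal gigabytes."""
    return size_bytes / 1e9


print(to_gb(storage_bytes(1_000_000, 1_000)))      # 1 million users x 1 KB
print(to_gb(storage_bytes(1_000_000_000, 100)))    # 1 billion rows x 100 B
print(to_gb(daily_log_bytes(10_000, 100)))         # 1 day of logs at 10K rps
```

Real storage needs are usually higher once indexes, replicas, and backups are counted, so treat these as a floor.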
Availability #
| SLA | Downtime/year | Downtime/month | Downtime/week |
|---|---|---|---|
| 99% | 3.65 days | 7.31 hours | 1.68 hours |
| 99.9% | 8.77 hours | 43.83 minutes | 10.08 minutes |
| 99.99% | 52.6 minutes | 4.38 minutes | 1.01 minutes |
| 99.999% | 5.26 minutes | 26.3 seconds | 6.05 seconds |
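
The downtime budget is the unavailable fraction multiplied by the length of the period. A minimal sketch, assuming a 365.25-day year and a month of one twelfth of a year, which matches the table above (`downtime_seconds` is an illustrative name):

```python
HOURS_PER_YEAR = 365.25 * 24        # 8,766 hours
HOURS_PER_MONTH = HOURS_PER_YEAR / 12
HOURS_PER_WEEK = 7 * 24


def downtime_seconds(sla_percent: float, period_hours: float) -> float:
    """Allowed downtime, in seconds, for an SLA over a period of `period_hours`."""
    unavailable_fraction = 1 - sla_percent / 100
    return unavailable_fraction * period_hours * 3600


# 99.9% over a year, in hours (about 8.77).
print(downtime_seconds(99.9, HOURS_PER_YEAR) / 3600)
# 99.99% over a month, in minutes (about 4.38).
print(downtime_seconds(99.99, HOURS_PER_MONTH) / 60)
# 99.999% over a week, in seconds (about 6.05).
print(downtime_seconds(99.999, HOURS_PER_WEEK))
```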
Powers of Two #
| Power of 2 (n in 2^n) | Exact value | Approx. | Size |
|---|---|---|---|
| 10 | 1,024 | ~1K | 1 KB |
| 20 | 1,048,576 | ~1M | 1 MB |
| 30 | 1,073,741,824 | ~1B | 1 GB |
| 40 | ~1.1 trillion | ~1T | 1 TB |
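
The approximation works because 2^10 is close to 10^3, so every ten powers of two add roughly three decimal zeros. A short sketch showing how the error grows (the helper `power_of_two_row` is hypothetical and only meaningful for exponents that are multiples of 10):

```python
def power_of_two_row(power: int) -> tuple[int, int, float]:
    """Return (exact value, decimal approximation, relative error) for 2**power.

    Assumes `power` is a multiple of 10, as in the table above.
    """
    exact = 2 ** power
    approx = 10 ** (3 * (power // 10))
    relative_error = exact / approx - 1
    return exact, approx, relative_error


for power in (10, 20, 30, 40):
    exact, approx, error = power_of_two_row(power)
    print(f"2^{power}: exact={exact:,} approx={approx:,} error={error:.1%}")
```

The error is about 2.4% at 2^10 and just under 10% at 2^40, well inside the tolerance that order-of-magnitude estimation needs.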
Mechanism #
The estimation framework:
- Clarify what you’re estimating (QPS? Storage? Bandwidth?)
- Start from user-facing numbers (DAU, actions/user/day)
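- Derive system-level numbers (average QPS = DAU × actions/user/day / 86,400 seconds per day)
- Apply a peak factor (the 80/20 rule of thumb: most traffic arrives in a small part of the day): peak QPS ≈ 2-3× average QPS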
- Compare against known limits (can one Postgres handle this?)
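
The steps above can be sketched as a few lines of arithmetic. The workload (10 million DAU, 20 actions per user per day) is an invented example, the peak factor of 3 is the upper end of the 2-3× rule of thumb, and the 10K queries/s limit is the conservative end of the PostgreSQL range in the throughput table; all function names are illustrative:

```python
SECONDS_PER_DAY = 86_400


def average_qps(dau: int, actions_per_user_per_day: float) -> float:
    """Step 3: turn user-facing numbers into an average request rate."""
    return dau * actions_per_user_per_day / SECONDS_PER_DAY


def peak_qps(avg_qps: float, peak_factor: float = 3.0) -> float:
    """Step 4: scale the average by an assumed peak factor (2-3x)."""
    return avg_qps * peak_factor


def fits_single_instance(peak: float, capacity_qps: float) -> bool:
    """Step 5: compare the peak against a known per-instance limit."""
    return peak <= capacity_qps


# Hypothetical workload: 10M daily active users, 20 actions each per day.
avg = average_qps(10_000_000, 20)
peak = peak_qps(avg)
print(f"average ~{avg:,.0f} QPS, peak ~{peak:,.0f} QPS")
print("fits one PostgreSQL instance:", fits_single_instance(peak, 10_000))
```

Here the average is roughly 2.3K QPS and the peak roughly 7K QPS, which stays under even the conservative limit, so this workload alone would not justify sharding.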
Instinct #
Estimation is about order-of-magnitude, not precision. The goal is to determine whether you need 1 server or 100, whether you need caching or not, whether you need sharding or not. Getting within 2-5× of reality is sufficient.
- RULE OF THUMB: Use estimation numbers at the point of balancing trade-offs, not from the start. Don’t open the interview with estimation; use it when justifying a specific decision (e.g. “should we shard?”).
- INSIGHT: Modern hardware defers the need for sharding and caching much further than older hardware would have. A single PostgreSQL instance comfortably handles a few TB of data and up to ~50K queries/s (the upper end of the range in the throughput table). Don’t over-scale based on outdated mental models.
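
The "1 server or 100" question reduces to one division. A hedged sketch, where the 70% target utilisation is an assumed headroom figure rather than anything from this note, and `servers_needed` is a hypothetical helper:

```python
import math


def servers_needed(
    peak_qps: float,
    per_server_qps: float,
    target_utilisation: float = 0.7,  # assumption: leave ~30% headroom
) -> int:
    """Number of servers needed to serve `peak_qps` without exceeding the target utilisation."""
    usable_qps = per_server_qps * target_utilisation
    return math.ceil(peak_qps / usable_qps)


# 500 peak QPS on servers that each handle ~5K requests/s: one server is enough.
print(servers_needed(500, 5_000))
# 10K peak QPS on the same servers: a handful, not a fleet.
print(servers_needed(10_000, 5_000))
```

An estimate that is off by 2-5× moves the second answer between about 1 and 15 servers, still far from a 100-server design, which is why that level of precision is enough.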
Reference #
- hellointerview: Numbers to Know
- Jeff Dean’s “Numbers Everyone Should Know” (originally from a Google talk, extracted)