Zero-Downtime Migrations

🟠 P1 — underrated staff-level signal; how to change schemas on a live system

Problem #

You need to rename a column, change a data type, or split a table — but the application is serving traffic 24/7 and you can’t take it down.

Expand-Contract Pattern #

Phase 1 (Expand):
  - Add new column (nullable or with default)
  - Deploy code that writes to BOTH old and new columns
  - Backfill: copy data from old column to new column

Phase 2 (Migrate):
  - Deploy code that reads from new column
  - Verify correctness (old and new columns agree; no errors or regressions on the new read path)

Phase 3 (Contract):
  - Deploy code that stops writing to old column
  - Drop old column in a later, separate deploy, once nothing reads or writes it
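
A minimal sketch of the expand and migrate steps, using SQLite from the Python standard library. The `users` table, the `fullname` → `display_name` rename, and every function name here are hypothetical, chosen only for illustration; a production database would need its own DDL and locking considerations.

```python
import sqlite3

def expand(conn):
    """Phase 1: add the new column as nullable so existing code is unaffected."""
    columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
    if "display_name" not in columns:  # safe to re-run
        conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

def create_user(conn, name):
    """Phase 1: dual-write, old and new columns set in the same statement."""
    cur = conn.execute(
        "INSERT INTO users (fullname, display_name) VALUES (?, ?)", (name, name)
    )
    conn.commit()
    return cur.lastrowid

def count_mismatches(conn):
    """Phase 2 check: rows where the columns disagree (IS NOT is NULL-safe in SQLite)."""
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM users WHERE display_name IS NOT fullname"
    ).fetchone()
    return n

def read_name(conn, user_id, use_new_column):
    """Phase 2 read path: a flag picks the column, so rollback is a flag flip."""
    column = "display_name" if use_new_column else "fullname"
    row = conn.execute(
        f"SELECT {column} FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO users (fullname) VALUES ('Ada')")  # pre-migration row
expand(conn)
create_user(conn, "Grace")      # written by the dual-write code path
print(count_mismatches(conn))   # 1: Ada's row still needs the backfill
```

Reads should only switch to the new column once the mismatch count reaches zero; until then the flag stays on the old column.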

Dual-Write Pattern (for service migrations) #

Phase 1: Write to old service + new service simultaneously
Phase 2: Read from new service (verify against old)
Phase 3: Stop writing to old service
Phase 4: Decommission old service
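
The four phases can be sketched as a thin facade in front of both services. This is an illustration under assumptions: `MigratingStore` and `Phase` are hypothetical names, and plain dicts stand in for the old and new services.

```python
from enum import Enum

class Phase(Enum):
    DUAL_WRITE = 1  # Phase 1: write both, read old
    READ_NEW = 2    # Phase 2: write both, read new, compare with old
    NEW_ONLY = 3    # Phase 3: old service is no longer written

class MigratingStore:
    def __init__(self, old, new, phase):
        self.old, self.new, self.phase = old, new, phase
        self.mismatches = []  # keys where old and new disagreed on a read

    def put(self, key, value):
        if self.phase is not Phase.NEW_ONLY:
            self.old[key] = value
        self.new[key] = value

    def get(self, key):
        if self.phase is Phase.DUAL_WRITE:
            return self.old.get(key)
        value = self.new.get(key)
        # Shadow-compare against the old service while it is still written.
        if self.phase is Phase.READ_NEW and self.old.get(key) != value:
            self.mismatches.append(key)
        return value
```

Phase 4 is then a code deletion: once `NEW_ONLY` has been stable, the facade and the old client can be removed. Keys written before the migration started only exist in the old service, so they show up in `mismatches` until a backfill copies them over.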

Dual-Write Pitfalls #

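  • Inconsistency window: The two writes are not atomic. If the write to the new system succeeds but the write to the old one fails (or the reverse), the systems diverge until something reconciles them
  • Ordering: Concurrent writes to the same key can reach the two systems in different orders, so each ends up with a different final value. Prefer CDC over application-level dual-write
  • Performance: Sequential dual writes add the new system's latency to every request (about 2× when the two are similar); parallel writes still wait on the slower of the two. Consider async writes to the new system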
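
One way to take the new-system write off the request path while still noticing failures, as a rough sketch. `AsyncDualWriter` and `write_new` are hypothetical names, and the in-memory queue is an assumption made to keep the example short; anything held in memory is lost on a crash, so a real setup would want a durable queue or CDC instead.

```python
import queue
import threading

class AsyncDualWriter:
    def __init__(self, old, write_new):
        self.old = old              # dict standing in for the old system
        self.write_new = write_new  # callable(key, value); may raise
        self.failed = []            # (key, value) pairs to reconcile later
        self._q = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    def put(self, key, value):
        self.old[key] = value       # old system stays the source of truth
        self._q.put((key, value))   # new-system write happens in the background

    def _drain(self):
        # A single worker applies writes in the order they were enqueued.
        while True:
            key, value = self._q.get()
            try:
                self.write_new(key, value)
            except Exception:
                self.failed.append((key, value))
            finally:
                self._q.task_done()

    def flush(self):
        """Block until every queued write has been attempted."""
        self._q.join()
```

The caller only waits for the old system. Failed writes land in `failed` rather than being dropped silently, which narrows the inconsistency window to something a reconciliation job can close.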

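Instinct: “Prefer CDC over application-level dual-write whenever possible. CDC reads the database’s committed write stream in commit order, so it sidesteps the partial-failure and ordering problems of dual-write. The consumer still has to apply changes idempotently, because delivery is typically at-least-once.”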

Instinct #

Every migration is a multi-deploy operation. Never combine a schema change and a code change in one deploy: during a rollout, old and new code run side by side, so the schema has to work with both. The expand-contract pattern ensures backward compatibility at every step. The hardest part is usually the backfill: it must be idempotent, resumable, and rate-limited to avoid overloading the database.
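
A sketch of a backfill with those three properties, using SQLite from the Python standard library. The `users` table, the `backfill_progress` checkpoint table, and the parameter names are assumptions for illustration; the fixed sleep is a crude stand-in for real rate limiting.

```python
import sqlite3
import time

def backfill(conn, batch_size=1000, pause_s=0.1, max_batches=None):
    """Copy fullname into display_name in primary-key order; return rows updated."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS backfill_progress "
        "(job TEXT PRIMARY KEY, last_id INTEGER)"
    )
    row = conn.execute(
        "SELECT last_id FROM backfill_progress WHERE job = 'display_name'"
    ).fetchone()
    last_id = row[0] if row else 0  # resumable: start after the checkpoint
    updated = batches = 0
    while max_batches is None or batches < max_batches:
        ids = [r[0] for r in conn.execute(
            "SELECT id FROM users WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        )]
        if not ids:
            break
        # Idempotent: only fills NULLs, so dual-written rows are never clobbered.
        cur = conn.execute(
            "UPDATE users SET display_name = fullname "
            "WHERE id > ? AND id <= ? AND display_name IS NULL",
            (last_id, ids[-1]),
        )
        updated += cur.rowcount
        last_id = ids[-1]
        conn.execute(
            "INSERT OR REPLACE INTO backfill_progress (job, last_id) "
            "VALUES ('display_name', ?)",
            (last_id,),
        )
        conn.commit()        # batch and checkpoint commit together
        batches += 1
        time.sleep(pause_s)  # rate limit: give the database room between batches
    return updated
```

Small batches keep each transaction short, and because the checkpoint commits with the batch it describes, a crash mid-run loses at most one uncommitted batch. Re-running the job after completion updates nothing.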
