Skip to main content
  1. References/
  2. Architecture Design Basics/
  3. Pattern Taxonomy/
  4. Encoding & Data Formats/

Serialisation Trade-offs

·· 286 words· 2 mins

🔴 P0 — DDIA Ch4 core content; underpins the gRPC proxy pattern and API versioning

Problem #

Data must be encoded to cross process/network boundaries. The encoding format determines: size on the wire, encode/decode speed, schema evolution capability, and human debuggability.

Mechanism #

FormatEncodingSchema?EvolutionHuman-readableTypical use
JSONTextOptional (OpenAPI)Weak (additive)✅ YesREST APIs, config
ProtobufBinaryRequired (.proto)Strong (field#)❌ NogRPC, internal services
AvroBinaryRequired (.avsc)Strong (names)❌ NoHadoop, Kafka, analytics
MessagePackBinaryNoWeak❌ NoRedis, embedded
ThriftBinaryRequired (.thrift)Strong (field#)❌ NoLegacy Facebook stack

Schema Evolution: The Critical Differentiator #

Forward compatibility: Old code can read data written by new code (new fields ignored). Backward compatibility: New code can read data written by old code (missing fields get defaults).

Protobuf achieves this via field numbers:

message User {
  string name  = 1;    // field 1, always present
  string email = 2;    // field 2, added later
  int32  age   = 3;    // field 3, added even later (optional)
}

→ Old code seeing field 3 ignores it (forward compat)
→ New code missing field 3 uses default 0 (backward compat)
→ NEVER reuse or change a field number

Key Trade-offs #

Size vs Debuggability #

  • JSON: ~2-10× larger than binary formats, but you can curl an endpoint and read the response
  • Protobuf: compact binary, but you need grpcurl or proto definitions to decode

Schema Enforcement vs Flexibility #

  • Schema-required formats (Protobuf, Avro): catch breaking changes at compile time, but require schema registries and coordination
  • Schema-optional formats (JSON): flexible and fast to iterate, but breaking changes discovered in production

Instinct #

Internal services: Protobuf. The schema enforcement and binary efficiency pay for themselves at scale. External APIs: JSON. Developer experience trumps wire efficiency for public-facing surfaces. The translation boundary is the proxy — see also: Web-gRPC Proxy.

References #

DDIA 2e Reference #

  • Chapter 4 (Encoding and Evolution): This IS Chapter 4. Sections 4.1–4.4 are essential.