Serialisation Trade-offs
🔴 P0 — DDIA Ch4 core content; underpins the gRPC proxy pattern and API versioning
Problem #
Data must be encoded to cross process/network boundaries. The encoding format determines: size on the wire, encode/decode speed, schema evolution capability, and human debuggability.
Mechanism #
| Format | Encoding | Schema? | Evolution | Human-readable | Typical use |
|---|---|---|---|---|---|
| JSON | Text | Optional (OpenAPI) | Weak (additive) | ✅ Yes | REST APIs, config |
| Protobuf | Binary | Required (.proto) | Strong (field#) | ❌ No | gRPC, internal services |
| Avro | Binary | Required (.avsc) | Strong (names) | ❌ No | Hadoop, Kafka, analytics |
| MessagePack | Binary | No | Weak | ❌ No | Redis, embedded |
| Thrift | Binary | Required (.thrift) | Strong (field#) | ❌ No | Legacy Facebook stack |
Schema Evolution: The Critical Differentiator #
- Forward compatibility: old code can read data written by new code (new fields are ignored).
- Backward compatibility: new code can read data written by old code (missing fields get defaults).
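A minimal Python sketch of both directions, assuming records are plain dictionaries rather than any real wire format. The names `read`, `V1_FIELDS` and `V2_FIELDS` are hypothetical and exist only for illustration.

```python
# Hypothetical reader versions: field name -> default value used when absent.
V1_FIELDS = {"name": ""}                          # old code knows only `name`
V2_FIELDS = {"name": "", "email": "", "age": 0}   # new code adds `email` and `age`


def read(record: dict, known_fields: dict) -> dict:
    """Decode `record` using only the fields this reader version knows.

    Unknown keys are dropped (forward compatibility); missing keys fall back
    to the reader's defaults (backward compatibility).
    """
    return {key: record.get(key, default) for key, default in known_fields.items()}


written_by_new = {"name": "Ada", "email": "ada@example.com", "age": 36}
written_by_old = {"name": "Ada"}

old_reader_view = read(written_by_new, V1_FIELDS)  # forward compatibility
new_reader_view = read(written_by_old, V2_FIELDS)  # backward compatibility
```

Under these assumptions the old reader sees only `name`, and the new reader fills `email` and `age` with their defaults.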
Protobuf achieves this via field numbers:
```protobuf
syntax = "proto3";

message User {
  string name = 1;   // field 1, in the schema from the start
  string email = 2;  // field 2, added later
  int32 age = 3;     // field 3, added even later (optional)
}
// → Old code seeing field 3 ignores it (forward compat)
// → New code missing field 3 uses default 0 (backward compat)
// → NEVER reuse or change a field number
```
Key Trade-offs #
Size vs Debuggability #
- JSON: ~2-10× larger than binary formats, but you can `curl` an endpoint and read the response
- Protobuf: compact binary, but you need `grpcurl` or proto definitions to decode
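The size gap can be illustrated with a rough Python sketch. The binary layout here is a hypothetical hand-rolled one (length-prefixed strings plus a one-byte age), not Protobuf's actual wire format, and `encode_binary` / `decode_binary` are illustrative names. It assumes strings shorter than 256 bytes and an age below 256.

```python
import json
import struct

record = {"name": "Ada", "email": "ada@example.com", "age": 36}


def encode_binary(user: dict) -> bytes:
    """Pack a user as length-prefixed UTF-8 strings followed by a one-byte age."""
    name = user["name"].encode("utf-8")
    email = user["email"].encode("utf-8")
    return (
        struct.pack("<B", len(name)) + name
        + struct.pack("<B", len(email)) + email
        + struct.pack("<B", user["age"])
    )


def decode_binary(data: bytes) -> dict:
    """Reverse `encode_binary`; only works if the reader knows the layout."""
    name_len = data[0]
    name = data[1:1 + name_len].decode("utf-8")
    offset = 1 + name_len
    email_len = data[offset]
    email = data[offset + 1:offset + 1 + email_len].decode("utf-8")
    age = data[offset + 1 + email_len]
    return {"name": name, "email": email, "age": age}


# Compact JSON repeats every field name on the wire; the binary form does not.
json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")
binary_bytes = encode_binary(record)
```

The JSON bytes are readable as-is, while the binary bytes are meaningless without the layout in `decode_binary`. That is the same debuggability cost a schema-based binary format carries.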
Schema Enforcement vs Flexibility #
- Schema-required formats (Protobuf, Avro): catch many breaking changes at build or schema-registration time, but require shared schema definitions (often a schema registry) and coordination between teams
- Schema-optional formats (JSON): flexible and fast to iterate, but breaking changes are often discovered only in production
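To make "catching breaking changes before production" concrete, here is a minimal sketch of the kind of compatibility check a build step or schema registry might run. The schema representation (field number mapped to name and type) and the function `breaking_changes` are assumptions made for illustration, not the API of any real tool.

```python
# Hypothetical schema versions: field number -> (field name, field type).
USER_V1 = {1: ("name", "string"), 2: ("email", "string")}
USER_V2_OK = {1: ("name", "string"), 2: ("email", "string"), 3: ("age", "int32")}
USER_V2_BAD = {1: ("name", "string"), 2: ("age", "int32")}  # field 2 reused


def breaking_changes(old: dict, new: dict) -> list:
    """Return human-readable problems found when evolving `old` into `new`."""
    problems = []
    for number, (old_name, old_type) in sorted(old.items()):
        if number not in new:
            continue  # removal is tolerated here; the number must stay retired
        new_name, new_type = new[number]
        if new_type != old_type:
            problems.append(f"field {number}: type changed {old_type} -> {new_type}")
        if new_name != old_name:
            problems.append(f"field {number}: renamed {old_name} -> {new_name} (possible reuse)")
    return problems
```

In this sketch, adding field 3 passes cleanly, while repurposing field 2 is flagged twice (type change and rename). With schema-optional JSON there is no equivalent artifact to diff, so the same mistake tends to surface at runtime.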
Instinct #
- Internal services: Protobuf. The schema enforcement and binary efficiency pay for themselves at scale.
- External APIs: JSON. Developer experience trumps wire efficiency for public-facing surfaces.

The translation boundary is the proxy; see also: Web-gRPC Proxy.
References #
- Schema Evolution in Avro, Protocol Buffers and Thrift — Martin Kleppmann
- Confluent Schema Registry — production schema evolution management
- Protocol Buffers: API Best Practices — Google
DDIA 2e Reference #
- Chapter 4 (Encoding and Evolution): This IS Chapter 4. Sections 4.1–4.4 are essential.