
🐫 Songs of the OCaml Compiler

I recently found myself with some time and the chance to finally pick up OCaml. Its use in correctness-first systems is what first caught my eye back in my university days; I never had the chance to explore it deeply. (If I remember right, my earliest glimpse of OCaml in industry came from this talk, hosted by the school’s hacker community, where an alumnus from a prop-trading shop came down to share what they were building.) Happily, this aligned with a long-term personal goal of building meaningful pedagogical artefacts for FOSS ecosystems (for example, guided projects with an effortful learning curve).

So I set an impossible goal: spend two weeks learning OCaml’s core patterns-of-thought by designing a proxy project as the learning substrate – a deterministic, dependency-free event-simulator for the classic Paxos consensus protocol. (A proxy project is a deliberately constructed side project whose primary purpose is to catalyse learning in a new domain. Instead of pursuing the project’s outcome as an end in itself, the creator uses it as a proxy for skill-building and intuition-forming under realistic conditions.) About four weeks later, the project is at a stage where I can focus on talking about it. (Scope-creep is inevitable for personal projects. This also doesn’t count the months between finishing the code and finishing the writing: consulting work intervened, and the series sat until I could give it the attention it needed. The open git history chronicles the actual timeline.)

Teaching people how to teach themselves is difficult: learning is a deeply personal habit, and not every skill is learned the same way. A reliable place to start exploring that is to make my own learning style visible.
Figure 1: 🙙 Photo of the Milky Way seen from the Gobi desert by Daniel Kordan, retrieved 🙛

This series documents the design thinking, the friction, and the moments where the compiler shifted from adversary to collaborator. The experience had an unexpected shape to it. What started as noise – type errors I couldn’t explain, constraints that felt arbitrary – gradually acquired a rhythm. Eventually, I caught myself applying the compiler’s discipline in places where it had nothing left to say. That arc – from bewilderment to internalisation – is what the series tries to make visible: the Songs of the OCaml Compiler. The arc has three movements, to be precise: obeying (the compiler refused my types and I learned why), negotiating (I pushed the type system toward state-machine invariants and it showed me where it stops), and internalising (the compiler went silent, and I found myself thinking in its grammar anyway). These are the three design pressures that Part II traces.

What This Series Is (and Is Not)

This is not a Paxos tutorial, an OCaml syntax guide, or a polished retrospective with the rough edges filed off.

It’s a record of what happens when you walk into an unfamiliar type system and let it change how you think. Real compiler errors, real design mistakes, real moments where confusion became insight. The roughness is the point: it shows the learning as it happened rather than presenting cleaned-up conclusions.

The narrative follows a set of Tuareg trade caravans crossing the Sahara as the backdrop for the Paxos protocol, and traces three design pressures where the OCaml compiler shaped my architecture. (The Tuareg people shepherded vast caravans across the Sahara: formations of camels carrying salt, spices and stories, coordinating through human couriers riding the fastest dromedaries. Caravans become nodes. Couriers become fragile network packets. Sandstorms become network partitions. Even the major Emacs mode for OCaml is called “Tuareg”.) Paxos itself is a classic consensus protocol from the field of distributed systems, where multiple coordinating entities attempt to agree on things while operating in chaotic environments. If this is your first time coming across this area of Computer Science, the series should still be a fun and intuitive foray into it.

Multiple Cuts, Multiple Readers

Technical writing forces a particular epistemic clarity on the author, gives others a window into how different people think, and, when done carefully, produces a re-readable body of work that is worth revisiting as the reader’s own experience grows. (The discipline of articulating why a design decision was made – not just what was decided – reveals gaps in understanding that code alone can hide. Every engineer’s process has lessons worth sharing; the difficulty is making them legible: simple yet entertaining. These posts are my attempt.) That’s the brand of repeatability that this series aspires to have. (Gold standard: I have watched “The Shawshank Redemption” about 11 times throughout my growing years and as a young adult. Beyond its consistent coda on the unyielding nature of hope within a man, there’s always something new you pick up when re-watching it. It would be a dream to look back in a few years’ time and see that I have created some pieces of work that are my own mini-Shawshanks.)

The writing serves multiple audiences simultaneously, and the structure reflects this: collapsible depth layers let readers choose their depth, interactive diagrams link directly to source code, and cross-language comparisons (Python, Elixir, TypeScript) sit alongside OCaml-specific material so that the ideas transfer. On the depth layers: technical details, cross-language comparisons, and historical digressions live inside expandable sections. The main narrative reads cleanly without them; opening them rewards curiosity without punishing pace. In some cases, they also help readers who find themselves in unfamiliar conceptual territory by giving them some intuition for certain technical fundamentals. On the diagrams: the architecture map in Part II is clickable – each box links directly to the relevant source file in the tagged release. It’s my preferred way of being introduced to a new codebase: structure first, then drill into the code at the points that interest you, like a map with points of interest.

Suggested Reading Paths

The result is multiple cuts through the same body of work for different profiles of people:

| If you are … | … you care about | then try starting with … |
| --- | --- | --- |
| someone interested in how engineers learn and make design decisions (pedagogues, mentors, the curious), even if you don’t consider yourself a technical person | the why behind design: forces, analogies, constraints | Part I: no code, all design thinking. Wanders into linguistics, learning science, and the cognitive residues that programming languages leave behind. |
| a strong programmer outside the OCaml world: you build systems, you’ve used a few languages, and you want to see the architectural substance juxtaposed with the code in a seemingly esoteric language | the how: trade-offs, type-system lessons, cross-language perspectives | Part II’s three design pressures, which open with enough context to stand alone. Part I’s Abstract Design Grammar is worth a look if the design constraints intrigue you. |
| someone who works in or around the OCaml ecosystem: you’re evaluating the engineering, the taste, the self-awareness | the what: implementation quality, honest debt, maturity of judgement | Part II directly, focusing on the design pressures. Part III’s debt ledger will be the most candid section about where the architecture falls short. |

Each part is self-contained. Reading in order (Part I → II → III) gives the full arc – what I see as noise becoming melody – but entering at any point (and reading discrete slices of it instead of the whole cake at once) works as well.


The Three Parts

| Part | The experience |
| --- | --- |
| I: A Proxy Project | No code. Design forces, abstract grammar, constraints before constructs. Personal narrative on linguistic relativity and what programming languages teach us. |
| II: A Design Tour | The most technically involved part. Three design pressures with real compiler errors, architectural diagrams, and code that evolves in front of you. |
| III: coming soon, untitled | Evolution paths, honest debt ledger, and what OCaml leaves behind in the way you think about any codebase, even ones not written in it. |

Project Status

The simulator (v0) is complete: JSON-driven scenarios, terminal-interactive harness, deterministic replay, colourised logging. The codebase is almost entirely hand-rolled; the exception is Jane Street’s OCaml stack (Base, Core, ppx_jane), which it relies on throughout because I picked up the basics of the language by reading the book “Real World OCaml” (my reading notes are here; my first line of OCaml ever written is also in the first commit for this project).

| Artefact | Current Status |
| --- | --- |
| Part I | published |
| Part II | soft-launched (pre-RC) |
| Part III | 🗺️ planned |
| Code | tagged v0-simple-paxos |

RFC

If you have ideas or feedback for making this series clearer, more demonstrative, or more useful — open an issue on Codeberg or write to hello@rtshkmr.com.

Signing your emails is always welcome; signed emails typically get instant replies from me.

I’m Ritesh Kumar. My public key can be downloaded from here or here — alternatively just search for it:

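# 1. Look up my public key by fingerprint (gpg will offer to import it):
gpg --keyserver keys.openpgp.org --search-keys 23BBA8476AA5F3FE0D944DD136F1C1A8EE1EB352

# 2. Export your own ASCII-armored public key and paste it into your email too
#    (replace the address below with your own):
gpg --armor --export yourmail@example.com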

2026


🐫 Songs of the OCaml Compiler Part II: A Design Tour

This is Part II of a series on learning OCaml by writing a Paxos simulator. We build on Part I’s abstract grammar and witness how subsystems take shape from it, guided, as it turns out, by the OCaml compiler. (I’m sure the magical source within the compiler is rooted in the underlying Hindley-Milner type system, which expects the programmer to exercise clearer type-discipline in exchange for superior inference capabilities that feel like an extension of the programmer’s own mind. I’ll put up a few words about what that experience has been like in Part III, which is itself coming soon.)

8991 words · 36–60 min read

2025


🐫 Songs of the OCaml Compiler Part I: A Proxy Project

This is Part I of a series on learning OCaml through building a Paxos simulator. Here, we stay away from the code entirely and focus on listening closely to the forces that will shape the architecture.

4901 words · 20–33 min read