Skip to main content
  1. References/
  2. Algo Reference/
  3. 99 OCaml Problems/

Beginner Exercises

· 2242 words· 11 mins

Problems for beginners taken from here. Instead of just beinga store for solutions, this aims to actually go deeper and float all the learning takeaways that I had when completing the problem set. The code-snippets here can be tangled in emacs I will expose this code as an org-file publicly once I’ve completed the 99 problems. It will be hosted on Codeberg, hopefully it helps others. and can be executed directly.

Figure 1: Camels @ the San Diego Zoo (retrieved)

Lists #

The exercises are grouped by classic data-structures. Naturally, List is the one we start with to build our foundation.

1. Tail of a List #

Write a function last : 'a list -> 'a option that returns the last element of a list

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
let input = ["a" ; "b" ; "c" ; "d"]

let rec last ls = match ls with
  | [] -> None
  | [x] -> Some x
  | _ :: tail -> last tail;;

last input;;
last [];;
last ["only"];;

This is a classic 3-pattern, remember the use of cons for the head-tail separation.

Tail-Recursive, Accumulator Pattern, Continuation Passing Style #

  • Keeping things tail-recursive

    The rule of thumb:

    A recursive call is tail-recursive if and only if nothing happens after the recursive call returns. The result must be returned directly — no wrapping, no arithmetic, no cons-ing.

    Negative example — reversing a list naively: (reverse tail) @ [head] The @ happens after the recursive return → not tail-recursive → stack-overflows on long lists.

  • Accumulator Pattern

    For functions that are naturally not tail-recursive, introduce an accumulator parameter and pass the growing result forward rather than building it on the return path.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    
    (* non-tail-recursive: addition happens after the recursive return *)
    let rec sum = function
      | [] -> 0
      | x :: rest -> x + sum rest
    
    (* tail-recursive: accumulator carries result forward *)
    let rec sum_tr acc = function
      | [] -> acc
      | x :: rest -> sum_tr (acc + x) rest
    Code Snippet 1: tail-recursive positive and negative examples

    The accumulator pattern is defunctionalised CPS — it’s what we get when the continuation is simple enough to collapse into a single state variable.

  • Continuation Passing Style (CPS)

    The core shift: instead of returning a value, we receive the rest of the computation as a function (k) and decide how to invoke it.

    StyleShape
    Directlet y = f x in g y
    CPSf x (fun y -> g y k)
    1
    2
    3
    4
    5
    
    (* CPS sum: continuation k receives the final result *)
    let rec sum_cps lst k =
      match lst with
      | [] -> k 0
      | x :: rest -> sum_cps rest (fun r -> k (x + r))
    Code Snippet 2: cps-style example

    What CPS actually buys us (in order of practical importance):

    1. Tail recursion universally. Every call becomes a tail call — no hidden stack growth. This is why compilers use CPS internally.

    2. First-class control flow. “The rest of the program” is a function we hold. We can pass it, store it, call it multiple times, or not at all. This unlocks:

      • early exit without language support
      • backtracking / search (pass a k_success and k_failure)
      • exception handling without runtime magic
      • coroutines, async, generators
    3. Explicit, programmable execution order. In direct style the runtime controls when things return and how the stack unwinds. In CPS, we control all of that.

    The cost:

    • Readability drops. Nested lambdas, inverted flow — objectively harder to read.
    • Closure allocation. Each continuation is a heap-allocated closure → GC pressure.
    • Debugging is harder. Stack traces flatten; “who called who” disappears – all-too-familiar callback-hell
    • Usually not idiomatic OCaml. A simple accumulator handles 90% of cases cleanly.

    When CPS is actually worth it:

    • Complex early exits across multiple recursion layers
    • Backtracking / search (parsers, SAT solvers, Prolog-style)
    • Writing interpreters or compilers (explicit evaluation order)
    • Async / cooperative multitasking (deferred continuations)

    When to use the accumulator instead:

    • Any time the continuation is “add this to the running total” or similarly simple — collapse it into state, don’t heap-allocate a closure.

    The deeper reframe: A function doesn’t return a value. It receives the rest of the computation and decides how to proceed. Once we see it that way: exceptions = alternate continuations, loops = recursive continuations, async = deferred continuations.

    Relationship between the three:

    • Direct recursion: implicit stack, not tail-recursive
    • Accumulator: explicit state, tail-recursive, works when continuation is simple
    • CPS: explicit continuation, tail-recursive, works for arbitrary control flow
    • Accumulator = CPS with the continuation defunctionalised into a variable

2. Last Two Elements of a List #

Find the last two (last and penultimate) elements of a list.

1
2
3
4
5
6
7
8
9
let input = ["a" ; "b" ; "c" ; "d"];;

let rec last_two = function
  | [] | [_] -> None (* <-- collapsed the insufficient data cases into one arm *)
  | [x; y] -> Some (x, y)
  | _ :: tl -> last_two tl;;

last_two input;;
last_two ["a"];;

The list literal pattern [x; y] is sugar for x :: y :: []. When we need to look ahead by k elements, we write k binders in the terminal arm. This generalises cleanly to “last three”, “last four”, etc. — same skeleton, wider terminal arm.

3. N’th Element of a List #

Find the N’th element of a list.

1
2
3
4
5
6
7
8
let input = ["a"; "b"; "c"; "d"; "e"];;
let rec at i = function
  | _ when i < 0 -> None
  | [] -> None
  | x :: _ when i = 0 -> Some x
  | _ :: tl -> at (i - 1) tl;;

at 3 input;;
Code Snippet 3: My solution: implements safe-indexing, recursively.
1
2
3
let rec at k = function
    | [] -> None
    | h :: t -> if k = 0 then Some h else at (k - 1) t;; (* -- the arithmetic pattern is encoded within an if-else *)
Code Snippet 4: Ideal solution
my implementation (working, less clean)
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
let input = ["a"; "b"; "c"; "d"; "e"];;

let rec at i ls = match ( i, ls ) with
  | (x, _) when x < 0 -> None
  | (_, []) -> None
  | (0, x :: _) -> Some x
  | (k, _ :: tl) when k > 0 -> at ( k - 1 ) tl;;

at 12 input;;
at 3 input;;
at ( -1 ) input;;

Singleton matches cons pattern #

My attempt had some redundancies:

  1. | (0, [x]) -> Some x gets subsumed into | (0, x :: _) -> Some x
    • this is because a singleton (i.e. [x]) will match the pattern: =x: _=

Balancing Guards #

The idiomatic OCaml preference: Encode constraints in the structure of the pattern (the shape of the data), not in guards.

Reach for guards only when the constraint is genuinely arithmetic or relational — something the type system cannot express structurally.

Reminders #

  • the negative input case needs to be handled

4. Length of a List #

Find the number of elements of a list.

OCaml standard library has List.length but we ask that you re-implement it. Bonus for a tail recursive solution.

Solutions:

1
2
3
4
5
6
let length list =
    let rec aux n = function
      | [] -> n
      | _ :: t -> aux (n + 1) t
    in
    aux 0 list;;
Code Snippet 5: Solution 1: Wrapper of a recursive helper may be nested within

This is the most idiomatic version. It keeps the name space clean by making the aux helper function scoped to the same function. It is tail-recursive based on our elaboration above on what tail-recursive means.

1
2
3
let rec len ?(acc=0) = function
  | [] -> acc
  | _ :: tl -> len ~acc:(acc+1) tl
Code Snippet 6: Solution 2: Cleaner for public APIs: use optional arg with default

I prefer this solution.

my initial solution

I have a recursive helper and a wrapper that calls it and abides by the API that the input example suggests.

1
2
3
4
5
6
7
8
let input = ["a"; "b"; "c"];;
let rec len_ acc = function
    | [] -> acc
    | hd :: tl -> len_ (acc + 1) tl;;

let len ls = len_ 0 ls;;

len input

this can be improved: the scope of the recursive helper may JUST be the wrapper itself – so we should nest it.

5. Reverse a List #

Reverse a list.

OCaml standard library has List.rev but we ask that you re implement it.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
let input = [1;2;3];;

let rev ls =
  let rec aux ls acc = match ls with
      | [] -> acc
      | hd :: tl -> aux tl (hd :: acc)
    in
    aux ls [];;

rev input;;

Accumulator is the reversed prefix #

The mental model to lock in:

When reversing with an accumulator, the accumulator is the reversed prefix so far.

Each step takes the head of the remaining list and prepends it to that reversed prefix.

Toplevel evaluation is all-or-nothing #

If a let binding fails to typecheck, it is not defined at all. Subsequent errors (like “unbound value”) are often downstream symptoms, not the root cause.

This feels like a REPL-quirk where in the broken snippet below:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
let input = [1;2;3];;

let rev ls =
  let rec aux ls acc = match ls with
      | [] -> acc
      | hd :: tl -> aux tl (acc :: hd) (* <-- this isn't right, can't cons list :: elem; should throw type error *)
    in
    aux ls [];;

rev input;;

(* actual error:
: Line 2, characters 0-3:
: 2 | rev input;;;;
:     ^^^
: Error: Unbound value rev
: Hint:   Did you mean ref?
*)

On the cons operator direction: a :: b always means “element on the left, list on the right.” The type is 'a -> 'a list -> 'a list.

If we ever find ourselves writing list :: element, that’s the tell that we’ve flipped it.

6. Palindrome #

Find out whether a list is a palindrome.

Hint: A palindrome is its own reverse.

1
2
3
4
5
6
7
8
9
let pos_input = ["x"; "a"; "m"; "a"; "x"]
let neg_input = ["a"; "b"];;

let is_palindrome ls =
    Stdlib.(=) ls (List.rev ls);; (* -- note: this explicit call is only because we're using top-level with Base defaults*)
    (* One can use either the rev function from the previous problem, or the built-in List.rev *)

is_palindrome pos_input;;
not (is_palindrome neg_input);;
My initial attempt

So this is a little naive, I attempted to do two checks.

It seems like it’s possible for us to just use pattern-matching (see ideal solution above).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
let pos_input = ["x"; "a"; "m"; "a"; "x"];;
let neg_input = ["a"; "b"]

let is_palindrome ls =
  let rev ls =
    let rec aux ls acc = match ls with
        | [] -> acc
        | hd :: tl -> aux tl (hd :: acc)
        in
        aux ls [] in
  let rec cmp a b = match (a, b) with
      | [], [] -> true
      | h_a :: t_a, h_b :: t_b -> if Stdlib.(=) h_a h_b then cmp t_a t_b else false (* -- namespacing cuz Base is default for my toplevel*)
  in
  if ls == [] then false else cmp ls (rev ls);;

is_palindrome pos_input

Structural = vs Physical Inequality == #

equality
structural (=)Compares values recursively. Works on lists, tuples, variants, records. This is what we want almost always.
physical (==)Compares memory addresses — are these the exact same object in the heap? Almost never what we want for value comparison.
  • Implicit behaviour: == may force int when ambiguous

    In the example below, we manually implement the cmp function but there’s an area that implicitly type-restricts the arguments to int:

    • ls = [ ]= \(\implies\) tries to make the type concrete and make it an int list
    Negative example
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    
    let pos_input = ["x"; "a"; "m"; "a"; "x"];;
    let neg_input = ["a"; "b"]
    
    let is_palindrome ls =
      let rev ls =
        let rec aux ls acc = match ls with
            | [] -> acc
            | hd :: tl -> aux tl (hd :: acc)
            in
            aux ls [] in
      let rec cmp a b = match (a, b) with
          | [], [] -> true
          | h_a :: t_a, h_b :: t_b -> if h_a = h_b then cmp t_a t_b else false
      in
      if ls == [] then false else cmp ls (rev ls);;
    
    is_palindrome pos_input
    Line 2, characters 14-23:
    2 | is_palindrome pos_input;;
                      ^^^^^^^^^
    Error: The value pos_input has type string/2 list/2
           but an expression was expected of type int/2 list/2
           Type string/2 is not compatible with type int/2
           File "_none_", line 1:
             Definition of type int/2
           File "_none_", line 1:
             Definition of type list/2
           File "_none_", line 1:
             Definition of type string/2
  • Language Provides the Structural Equality Logic

    The = recursively walks the data structure and compares element by element. For [1;2;3] = [1;2;3], the runtime traverses both lists in lockstep, comparing heads, recursing on tails, returning true only if every element matches and both lists terminate simultaneously. We don’t write that logic because the language provides it generically for any type.

  • toplevel quirks: Base vs Stdlib equality gotcha

    So in my utop, I’ve originally defaulted to JaneStreet’s Base, where the = differs from Stdlib.(=). This is an annoying gotcha.

    Jane Street’s Base library replaces polymorphic (=) with a version that requires explicit type-specific comparators. If utop is loaded with Base, writing ls = List.rev ls fails because Base(=) won’t accept 'a list without a compactor.

    Fix: use Stdlib.()= explicitly to get OCaml’s built-in structural equality.

    let is_palindrome ls = Stdlib.() ls (List.rev ls)=

    This only surfaces in a Base-defaulted utop session. In a normal .ml file without open Base, plain ()= works fine.

7. Run-length Encoding #

Run-length encoding (RLE) is a form of lossless data compression in which runs of data (consecutive occurrences of the same data value) are stored as a single occurrence of that data value and a count of its consecutive occurrences, rather than as the original run. For example, a sequence of “green green green green green” in an image built up from colored dots could be shortened to “green x 5”.

# encode ["a"; "a"; "a"; "a"; "b"; "c"; "c"; "a"; "a"; "d"; "e"; "e"; "e"; "e"];;
- : (int * string) list =
[(4, "a"); (1, "b"); (2, "c"); (2, "a"); (1, "d"); (4, "e")]
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15

let input = ["a"; "a"; "a"; "a"; "b"; "c"; "c"; "a"; "a"; "d"; "e"; "e"; "e"; "e"];;

let encode ls =
  let rec aux accum ls =
    match (accum, ls) with
      | _, [] -> accum
      | [] , hd :: tl -> aux [(1, hd)] tl
      | (freq, symbol) :: a_tl, hd :: tl -> if Stdlib.(=) hd symbol
                                            then aux ( (freq + 1, symbol) :: a_tl ) tl (* prepend @ O(1)*)
                                            else  aux ((1, hd) :: accum ) tl (* prepend @ O(1) *)
  in
  ls |> aux [] |> List.rev;; (* List.rev @ O(n) *)

encode input;;

The official answer:

1
2
3
4
5
6
7
let encode list =
    let rec aux count acc = function
      | [] -> [] (* Can only be reached if original list is empty *)
      | [x] -> (count + 1, x) :: acc
      | a :: (b :: _ as t) -> if a = b then aux (count + 1) acc t          (encode_as_binding_lookahead)
                              else aux 0 ((count + 1, a) :: acc) t in
    List.rev (aux 0 [] list);;
an alternative, official answer that uses pack
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
let pack list =
    let rec aux current acc = function
      | [] -> []    (* Can only be reached if original list is empty *)
      | [x] -> (x :: current) :: acc
      | a :: (b :: _ as t) ->
         if a = b then aux (a :: current) acc t
         else aux [] ((a :: current) :: acc) t  in
    List.rev (aux [] [] list);;

let encode list =
    List.map (fun l -> (List.length l, List.hd l)) (pack list);;

Idiom: Accumulator head as “current working state” #

This is a general idiom, where the in-process group being built (the (freq, symbol) :: a_tl, hd :: tl pattern arm) has the accumulator’s head within it.

The accumulator is doing 2 things here:

  1. the completed-groups history
  2. the live group at the front

We’ll see this idiom show up again when doing grouping problems, run-detection, segmentation over a list

Recognition signal for this idiom: when we need to “look at the last thing we committed” while processing the next element.

Idiom: as binding for peek/lookahead pattern-matching #

The as binding we see in the model ans (ref) – b :: _ as t allows us to:

  • lookahead at the structure of the tail t, and still have access to its head, b
  • we’re doing a next-element peek, we don’t need to reconstruct t again when we pass the tail to the recursive call .

The pattern | a :: (b :: _ as t) -> if a = b then aux ... t else aux ... t is the idiom here \(\implies\) OCaml lookahead

Best for whenever we need to compare the current element to the next one without consuming the next, this is the shape.

Cost Trade-offs and why pre-pending + List.rev @ the end is correct #

  1. We have a trade-off where at each step, our prepend happens @ \(O(1)\) and then we do a List.rev that runs at \(O(n)\)
  2. We shouldn’t be tempted and:
    1. append at front accum @ [(1, hd)] to avoid the final reverse because that append happens at \(O(n)\) per step \(\implies\) the whole function would end up being \(O(n^{2})\)

Using sentinel values #

We could attempt to use a sentinel that would generalise the init arm into the general case arm – it’s like instead of saying “if no current group, start one”, we say “pretend there’s always a current group, even before the first element” – and that “pretend” value is the sentinel.

this is awkward because the “initialisation” has just been shifted to outside the loop – which is the practical equivalent of a sentinel
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
let encode ls =
  let rec aux accum ls =
    match (accum, ls) with
    | (freq, symbol) :: a_tl, hd :: tl ->
        if Stdlib.(=) hd symbol
        then aux ((freq + 1, symbol) :: a_tl) tl
        else aux ((1, hd) :: accum) tl
    | _, [] -> accum
    | [], _ -> failwith "unreachable"
  in
  match ls with
  | [] -> []
  | hd :: tl ->
      aux [(1, hd)] tl |> List.rev

Though I don’t really like the use of runtime failwith here.

failwith "unreachable" is a code smell #

It converts a static guarantee into a runtime assertion.

The compiler cannot verify it. Refactoring can silently invalidate it. If we ever refactor the match structure and our assertion becomes wrong, we get a runtime crash instead of a compile error.

When we reach for it, treat it as a signal to restructure the match so the impossible arm simply doesn’t exist.

Design lession: the pack decomposition #

The pack decomposition is a design lesson.

The pack-then-map approach reveals an important learning:

  • encode is actually two separable concerns

    1. grouping (pack) and

    2. summarising each group (map over length + head).

      Our single-pass solution fuses these, which is more efficient but less composable.

      The pack version is:

      • easier to test
      • easier to modify (what if you wanted the groups themselves, not the counts?),
      • easier to reason about
  • The tradeoff is one extra \(O(n)\) pass.

The general design principle to learn:

When a problem has two natural phases, then implementing them separately and composing them is often cleaner than fusing. Fuse only when you have a measured performance reason.

nice.

This is incomplete, timebox exceeded – will be updated in the next 24h.

TODO Arithmetic #

TODO Binary Trees #

TODO Multi-way Trees #

TODO Graphs #